HUMAN FACTORS IN FIELD TESTING


ABSTRACT 


This study examined the problem of human factors field
evaluation within the Navy for the purpose of formulating
recommendations for improved evaluative methods and techniques.
Conclusions and recommendations were drawn from review of
relevant general literature and Navy documents, interviews and
discussions with individuals experienced in human factors
evaluation, and an examination of the design and evaluation
procedures carried out during the development and test of a
specific Navy aircraft system.

Conclusions and recommendations are enumerated in Section
6.0 of the report along with references to the relevant supporting
Section(s) within the body of the report.

Briefly, it was concluded that human factors evaluation does
not receive emphasis or support comparable to that given equipment
evaluation or commensurate with the importance of the human
operator to the successful functioning of the system. Much more
definitive and timely information must be provided to the human
factors field evaluator; evaluations must be mission oriented rather
than cockpit centered; and the role of the mock-up inspection needs
redefinition. Assignment of trained Navy human factors personnel
to advise and assist the contractor during development is recommended,
as is assignment of contractor human factors personnel during field
evaluations. Close cooperation between human factors and equipment
design evaluation personnel during field evaluation will greatly
increase the effectiveness of the evaluation. Finally, a short
intensive training course in human factors evaluation problems and
methods is recommended for Navy personnel assigned to system
evaluation.

1.0 PROBLEM SUMMARY


This study was initiated in response to the expressed need for
a practical and reliable means of human factors evaluation during
the field testing of complex weapons systems. Its objective was
the production of practical and concrete suggestions for meeting
this need.

From the initial broad statement of the problem the study was
narrowed to focus upon the development of human factors test and
evaluation techniques for operators of aircraft systems. The
decision to focus attention upon these systems was based upon
deliberations as to how the study might be most effectively carried
out. Aircraft systems were chosen as most suitable, particularly
as initial systems for study, for the following reasons:

1. They are major systems within the Navy.

2.  They  are  complex  in  that  they  embody  complex  subsystems. 

3. Human factors considerations are heavily involved in
their operation.

4. Human factors specifications and requirements are laid
on their development, test and evaluation.

5.  They  involve  subsystems  whose  test  and  evaluation 
procedures  may  be  generalizable  to  other  Navy  systems. 

In order to study the human factors test and evaluation process
at first hand and to base recommendations upon knowledge of the
actual workings of the RDT&E process, a specific aircraft system
was chosen for detailed study. The particular system chosen, the
A-7A aircraft, was selected since its recency in completing the
RDT&E process suggested that it was representative of current
practice, and personnel knowledgeable about the system were available
for interview and discussion.


As the work progressed the human factors efforts for the A-7E
and the P-3C aircraft were examined for particular information about
analyses and timing of human factors reports.

The detailed study of the A-7A was supplemented by an extensive
review of formal and informal reports and writings relevant to the
problem and by interviews with human factors personnel, administrators
and evaluation project personnel. A reference list of the reports
found most helpful is given in the Bibliography of this report.


2.0  HUMAN  FACTORS  EVALUATION  -  A  FRAME  OF  REFERENCE 


2.1  TOTAL  SYSTEM  OR  COMPONENT  EVALUATION 

In considering the field evaluation of a particular system one
is likely to assume that the total system is in some way evaluated
with respect to its capabilities for achieving some specified goal
or mission. Under this assumption one might conclude that the field
test of the system is a simple matter of determining on a "go-no go"
basis whether or not the system meets the mission criteria. If
such were the case, specific concern with evaluation of the human
component (indeed the evaluation of any other component of the
system) would seem to be unnecessary busy work when carried out in
connection with a field test of the system.

The assumption that human factors evaluation is a necessary
and somehow distinctive part of the field evaluation required some
justification and clarification - at least to this investigator.

An answer seemed to be needed to the question as to why one should
be concerned with the minutiae of evaluating components and
subsystems of a system if the evaluative decision is one of "accept"
or "reject" the total system. Presumably if the total system
successfully accomplishes its design mission we would have no
interest in measuring the performance of any of its components -
human or hardware. If the system meets its mission criteria we
might assume that the hardware and human components are functioning
so as to bring about total mission success.

As it turns out there are practical and cogent reasons why
component and subsystem evaluation is necessary. However, the
conclusion has been reached by this investigator that evaluations go
on in this somewhat piecemeal way without a clear statement as to
why they do, why it is necessary that they do, and what is required
in order for them to be more effective under such a system of operation.
This applies particularly to evaluations of the human component of
the system.

The reasons for conducting component and subsystem evaluation
rather than an overall system evaluation for most systems, when
examined closely, tend to bring into focus the problems of and the
requirements for carrying out an adequate human component field
evaluation. It is hoped that the following paragraphs help to
crystallize the problem and form a basis and a rationale for the
recommendations given later.

2.2.1 The Nature of the Development Process

The  first  important  reason  and  necessity  for  being  concerned 
with  component  testing  during  field  evaluation  lies  in  the  nature 
of  the  development  process  itself.  In  theory,  at  the  beginning  of 
the  development  cycle  the  mission  of  the  system  is  delineated  in 
detail  with  criteria  for  successful  performance  clearly  spelled  out. 

Many considerations militate against such a clear delineation.
When a system is developed to execute a new mission or to
extend the capabilities for executing a present mission, the details
of the system and its performance criteria cannot be stated
definitively at the outset. Generally, rather explicit overall
system criteria are established to be attained through application
of the present state of the equipment art or the projected state
of that art. However, details of the mission and the performance
criteria develop with the development of the system - or perhaps
more precisely, with the development of the hardware for the system.

As the iterative process of design proceeds the details of system
components and their requisite individual performance criteria
emerge. The overall goal or mission of the system is broken down
into intermediate or secondary goals for subsystems and components
of the total system. The attainment of these intermediate or
secondary goals by the components and subsystems is intended to
summate into attainment of the overall goal by the system.

During the development process a system is being synthesized
from components chosen after an analytical exercise in which the
total system requirements have been broken down into subsystem and
component requirements. Actually the processes of both analysis
and synthesis go on in iterative fashion throughout the development
period. The central point to be made, however, is that system
synthesis is attained through selection of components and
subsystems which can perform in accordance with the requirements made
explicit by the analysis. Components are chosen (1) whose
input-output characteristics match adjacent components, (2) which perform
the proper transformations on the input and (3) which perform their
proper function within the required time. Component and subsystem
performance summate to total system performance. One needs only to
reflect on the process of synthesizing a simple audio circuit to
understand how the incompatibility of one component can lead to
total system failure.

When system development is viewed as a process of synthesis of
compatible components it is obvious that the human as a component
should be treated with no less respect than the components he operates -
the black boxes. It should be obvious also that his performance
requirements within the particular system must be considered from the
very beginning of the development process.

Field evaluation of a system takes its cue from the development
philosophy. For most systems total system effectiveness in field
evaluation is an estimate based upon evaluations of components and
subsystems of the system. Most field evaluations, therefore, are
not, and probably cannot be, evaluations of the system in toto in
its intended operational environment. Rather they are evaluations
of particular components and subsystems as they operate, in
combination with other parts of the system, in an environment more
or less representative of the intended operational environment of
the system. The tested components (both hardware and human) are intended
to be representative of those which will finally comprise the total
system. These intentions are often only approximately realized.

In field evaluation we have the dual problem of choosing (1)
the proper components and/or subsystems for test and (2) the
representative environmental conditions under which to test them -
this testing being designed to provide an accurate estimate of
how well the total system will function in its operating environment.

The fact of total system development being a synthesizing
process and evaluation being the testing of components and
subsystems justifies laying emphasis on a point made by other writers
and one which will be discussed at length in this report. That
is, in order to properly evaluate a component of the total system
it is necessary that the evaluator know explicitly the role of
that component in the total system. The human operator is such a
component. The human factors evaluator must know what the operator
must do, how well he must do it and under what environmental
conditions. This determination cannot be left to last minute
speculations by the evaluator in the field.

2.2.2 The Trouble-Shooting Aspect of Evaluation

A second reason and necessity for evaluation at the subsystem
and component level comes about when a particular chain of components
or subsystems is tested and the performance fails to meet the
standard. Under these circumstances it is necessary to determine
which subsystem(s) or component(s) failed to perform to their
particular criteria. The human factors evaluator is interested
in determining whether the human component failed and, if so, in
what way.

These circumstances require that information on performance
of subsystems or components be obtained in order to pin-point or
diagnose the source of the difficulty. This information must be
obtained through a systematic and reliable means suitable for
identifying the trouble spot within the larger unit after the
larger unit has failed. Thus, there is a trouble-shooting aspect
to field evaluation which must be made more systematic and reliable
with respect to human factors evaluation.

2.2.3 The Adaptive Human Component

A third reason for component evaluation is peculiar to the
human component. It appears to be not unusual during field
evaluations that the ingenuity and adaptability of the human
operator enable him to perform in a way which results in system
success even though his actions and performance may have been quite
different from those anticipated by the designer. In such situations,
acting in the manner which the designer had envisioned may
not have been within his capabilities and would have caused him (and
consequently the system) to fail. It is this adaptability and
ability to recover which characterizes the human component of the
system, makes human factors evaluation (or data collection)
important in all systems, and differentiates the human
component evaluation from that of other components of the system.

A  method  for  determining  when  and  how  the  operator  has  performed 
in  this  adaptive  way  is  necessary  for  guiding  re-design,  procedural 
changes  or  training. 

3.0 CRITERIA AND MEASUREMENT


3.1  EVALUATION  AGAINST  A  STANDARD  VS.  MEASURES  OF  ABSOLUTE  PERFORMANCE 

In evaluating a system, component, or subsystem, standards
against which to evaluate are necessary. In evaluating the human
component's performance in a given system the test question is not
how well he performs in an absolute sense, but rather whether he
performs well enough to meet the standards set by the system.

However, due to the paucity of human operator performance data,
absolute measures of performance should be obtained at every
opportunity irrespective and independent of the operator's
performance relative to the standards of the system being evaluated.
Absolute measures are necessary to provide data on the human
component for use in the synthesis of future systems. Collection
of these data will continue to be necessary until an adequate bank
of human performance design data is established.

This section discusses these two aspects of performance
criteria, i.e., that of evaluating with respect to the standard
and that of measuring absolute level of performance.

3.2  DERIVATION  OF  PERFORMANCE  STANDARDS 

As  pointed  out  in  Section  2.0  the  synthesis  of  a  system  requires 
the  assembly  of  components  which  meet  certain  requirements  as  to 
their  input-output  characteristics  and  time  constraints,  and  which 
will  operate  satisfactorily  in  the  operational  environment. 

The systems engineer, in synthesizing a system, must be explicit
in stating the inputs, outputs, and data transformations required
of each component in order for the system to work. Presumably this
would also be done for the human component, thereby setting standards
or criteria for his performance.

However, let us examine the task of the system engineer in
designing a system in which he must deal with the human as a component.
He is faced with the task of fitting into the system a component
about which he has little information as to its exact capabilities
for handling data. His guidelines are often general and his data
are subjective extrapolations from minimal information. He may
know that similar operators have performed similar functions
adequately in other systems. He may have available to him the
judgments of human engineers as to whether proposed inputs to,
outputs from, and transformations by the human component are
within its capabilities for handling them. He also has faith in
the adaptability (or trainability) of the human component to "come
through" and be able to perform adequately. At present the latter
may well be his strongest weapon.

The lack of specificity about the human operator's capabilities
and the designer's faith in human adaptability often lead to a
laxness in stating specifically the inputs to him and the outputs he
must deliver. More commonly, it is known explicitly what is required
of a chain or combination of components, one of which is the human.

It has not been customary therefore to enumerate explicitly the
data input-output requirements of the operator as a guide and a
constraint in designing a system. The result has been specific
information about hardware performance requirements and capabilities
and the creation of a rather amorphous slot between two pieces of
hardware into which the human operator must fit. The required
standard of performance for the item of equipment immediately
downstream from the operator is known and it is assumed that the
operator will perform adequately without being specific as to the
level of performance required.

The reader may take exception to the latter statements as
over-reaching the facts. Such exception is justifiable to the
extent that there are some operator inputs and outputs which can
be specified for the system and for which the human capability
for handling them can be stated. These tend, however, to be more
in the nature of minutiae such as dial legibility, knob size, etc.,
and concern with them has been characterized as "knobs and dials"
human engineering. Attention to these details in design is
necessary and their importance is not denied. However, it also cannot
be denied that there are a significant number of requirements
placed on the human operator by the system and fulfilled by him
which are more complex than dial reading and knob turning. It is
with these that the human engineer has trouble when he attempts to
be explicit about which inputs the operator must and can act upon,
in what sequence, what integrative process is required and used,
and what the adequacy of the output will be.

As an aside, in this paper concern with knobs and dials,
legibility, panel layout and the like will be spoken of as cockpit
human engineering and evaluation. Concern with the larger problem
of human performance requirements and capabilities as they relate
to accomplishment of a mission by a system will be spoken of as
mission-system human engineering and evaluation. More will be said
of these two later.

Does this amorphism about the human component's functions mean
that standards cannot be set and performance evaluated? The answer
is that his performance can be and is evaluated inferentially
when it cannot be evaluated directly. This comes about from the
building block nature of the synthesizing process in which components
function in sequential fashion and summate into subsystem and
system functioning. The validity of inference as to the performance
of the human component is a function of the number and variability
of the equipment units between the point of his output and the
point of measurement. The standards for operator performance
are the standards set for the equipment items into which he makes
inputs. This is illustrated in Figure 1 and will be helpful for
the human factors evaluator to bear in mind.


Figure 1. Test points, T1 and T2, at which measurements of
operator performance may be taken.


In point of fact most human performance data in dynamic closed
loop systems is inferential in the sense that the output of some
unit of equipment as it functions in response to the operator's
input is measured and the human performance inferred from knowledge
of the equipment's characteristics. In evaluating human performance
in aircraft systems the evaluation is particularly dependent upon
inferred performance for many operator tasks. It is important
therefore that the human factors evaluator have precise knowledge
about the criterion performance required of the chain or combination
of elements being tested, of which the human component is a part,
and from which he desires to infer the adequacy of operator performance.
It is equally important that he know the characteristics of the
equipment into which the operator is making inputs and from which
measurements are being taken. He must have an assessment of the
variability contributed by the equipment to the combined
operator-equipment output in order to assess that contributed by the operator.
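
The last requirement can be illustrated numerically. The following sketch is not taken from the report; it assumes that operator and equipment errors are independent and additive, so that their variances sum, and the function name operator_sd is hypothetical. Under that assumption the operator's share of the variability is what remains after the equipment's variance is subtracted from the combined variance.

```python
import math


def operator_sd(combined_sd: float, equipment_sd: float) -> float:
    """Estimate the operator's standard deviation at a test point.

    Assumption (not from the report): operator and equipment errors are
    independent and additive, so combined_var = operator_var + equipment_var.
    """
    residual_variance = combined_sd ** 2 - equipment_sd ** 2
    if residual_variance < 0:
        # The equipment alone cannot vary more than operator plus equipment.
        raise ValueError("equipment variability exceeds combined variability")
    return math.sqrt(residual_variance)
```

For example, a combined standard deviation of 5 units with an equipment standard deviation of 3 units leaves an estimated operator standard deviation of 4 units. Where the two error sources are correlated, this simple subtraction does not hold.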

To recapitulate, two conditions of human operator evaluation
are present in the aircraft system. The first is that condition in which
the performance of the operator can be compared to the required
standard (or measured) directly. Such conditions are rare, for in a strict
sense no human performance can occur in the absence of some equipment
interface with the human either on the input side, the output side
or both. Accuracy in reading information from instruments as
measured by verbal report or time to actuate a given control are
examples of test conditions in which direct measurements can be
taken. These performances are observed at T1 in Figure 1. More
commonly the human performance is inferred from observing
performance of a series or chain of elements as shown at T2 in
Figure 1. Performance is observed at T2 and inferences as to
human performance are dependent upon knowledge of the equipment
elements in the chain.

It is apparent that identification, description and priority
setting of test points requires a detailed knowledge of the system
design. Since many iterations of the design may occur during
development, test point description and priorities must be
continuously updated. The test points must be made explicit by
the contractor's human factors personnel and conveyed to the field
evaluation personnel. Further, since field evaluation requires
advance planning this information should be conveyed to evaluation
personnel on a continuing basis and begin probably not later than
the initiation of the preliminary evaluation phase. The specific
point during development at which the human factors field evaluator
should begin receiving this information cannot be pin-pointed exactly.
It will vary with the system, its uniqueness, its complexity, etc.
However, it is certain that it cannot be left to be determined by the
field evaluator alone just prior to reception of the system by the
test facility as is the case with most systems at present.

The specific test points and the required levels of performance
must be identified through the system analysis and synthesis carried
on during development. As the system design becomes more firm and
the hardware characteristics become fixed, operator-equipment
interfaces must be identified and those output points selected from which
operator performance can be inferred through measurement of equipment
outputs.

3.3 ABSOLUTE LEVEL OF PERFORMANCE MEASUREMENT

Performance data obtained during field evaluations must serve
both the need of evaluating a particular system and that of providing
data useful in the design of future systems. Therefore, it is
important, both for the evaluator in assessing his data and for
future use of the information, that the characteristics of the
equipment with which the operator interfaces during the test be known
and reported. It is one of the weaknesses in the data on human
performance that there is incomplete knowledge about the specific
characteristics of machines and equipment with which the data were
collected and which correlate with performance. In most cases,
the human engineer cannot be as exact about what parameters of a
piece of equipment will affect operator performance as, for example,
the aeronautical engineer can be about what parameters will affect
the stability of his aircraft. It is therefore incumbent upon
the human engineer during design and the human factors evaluator
during system test to enumerate and describe the parameters of the
equipment and conditions of test if the human performance
information obtained is to be useful for future design.

3.4  AN  EVALUATION  MISSION 

In planning the field evaluation and delineating test points,
standards of performance at these points must be set. However,
these standards may differ depending upon the overall mission being
undertaken and by segment of a mission. It is necessary then to
formally define the evaluation mission in order to arrive at test
point standards.

In setting up an evaluation mission it is suggested that it be
built around the primary mission of the system with emphasis upon
the weapons delivery phase. While test points should be identified
during the more common phases of the mission such as launch, climb
out, cruise, loiter, etc., for the types of aircraft with which
we are concerned, the primary emphasis and priority should be upon
that phase of the mission in which the enemy is encountered or the
target attacked. Special problems may arise, of course, with less
conventional aircraft which demand special priority but even there
the primary weapons delivery role of the system should be emphasized.

In selecting the evaluation mission a suggested further
consideration is that "worst case" conditions be represented within
it. By worst case conditions is meant those conditions in which
the greatest demands are placed upon the human operator either as
to level of performance or time pressure. Further, as a guide to
the selection of the evaluation mission consideration should be
given to the type of mission which will be flown in the system's
most likely theater of operation.

The evaluation mission must then be broken down into segments.
The functions carried out during each segment and the equipment
involved must be identified from the systems and task analyses
carried out during development. Those functions in which the
human component is involved are then specified. Decisions must then
be reached as to the relative importance of the functions to
mission accomplishment and the test points identified for
obtaining performance measures which will give the most valid picture
(either directly or inferred) of the operator's performance.

3.5  ASSIGNING  PRIORITIES  TO  TEST  POINTS 

Identification and description of test points are critical to
human factors evaluation. However, since it is not practically
possible for measurements to be taken at every identifiable point
a priority must be set up. Two bases for priority are necessary.
First, a priority system is needed which ranks the relative
importance of the test point performances with respect to their
effect upon overall mission success. Second, a system is needed which
ranks the test points relative to the degree to which assumptions have
been made about human performance during design. Those
cases in which the least data were available on which to predict
human performance when the operator was designed into a given slot should be
given highest priority in testing in order to test the assumptions.
Parenthetically, these points in particular should be tested as
early in the development process as possible.

The relative importance of the several functions to mission
accomplishment may be obtained by the method described by Rook
(1964) using rank order techniques. For guidance in the use of
rank order and paired comparison methods the evaluator should have
at hand a copy of Guilford (1954). Rook describes his procedure
as that of describing concisely on separate cards each of the
events in the group to be evaluated. For paired comparison the
cards can be presented in pairs and the judge asked to make a
judgment as to which is the more and which is the less important
to mission success. If the number of events to be ordered is too
large for the paired comparison method to be feasible the cards
may be simply ranked from most to least important.

A method for ranking such events which the present writer
has found feasible for use in the field and which captures the
judge's interest combines the paired comparison and rank order methods.
It is recommended particularly when the number of events to be
judged is large. With this method the judge selects an event
(a card) which he feels lies at some point near the middle of
the range of events to be judged. He then compares each card
with this reference card and sorts the cards into two decks, one
containing those events judged to be more important than the
reference event and one containing those judged to be less important.
Each of the resulting decks is then sorted in the same way.

This  sorting  procedure  is  carried  on  until  each  deck  is  small 
enough  for  the  events  within  the  deck  to  be  conveniently  ranked 
within the deck, i.e., 2 to 5 cards. The result is a ranking
from  most  to  least  important  of  all  events  within  the  total  deck. 
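
The sorting procedure can be sketched in Python as follows. This is a minimal sketch, not the writer's own tooling. The function name `rank_events`, the `more_important` callback that stands in for the judge, the use of the middle card of the deck as the reference card, and the sample scores are all assumptions for illustration.

```python
from functools import cmp_to_key


def rank_events(cards, more_important, small_deck=5):
    """Return the cards ordered from most to least important.

    more_important(a, b) stands in for the judge and returns True when
    event a is judged more important than event b.
    """
    cards = list(cards)
    if len(cards) <= small_deck:
        # Small deck: rank the cards directly. The judge's direct ranking
        # is simulated here with the same pairwise judgment.
        order = cmp_to_key(lambda a, b: -1 if more_important(a, b) else 1)
        return sorted(cards, key=order)
    mid = len(cards) // 2
    reference = cards[mid]  # placeholder for the judge's choice of reference card
    more, less = [], []
    for i, card in enumerate(cards):
        if i == mid:
            continue
        (more if more_important(card, reference) else less).append(card)
    # Each resulting deck is sorted in the same way.
    return (rank_events(more, more_important, small_deck)
            + [reference]
            + rank_events(less, more_important, small_deck))


# Hypothetical judge: hidden importance scores stand in for human judgment.
scores = {f"event {i:02d}": s
          for i, s in enumerate([7, 3, 11, 1, 9, 5, 12, 2, 8, 4, 10, 6])}
ranking = rank_events(scores, lambda a, b: scores[a] > scores[b])
print(ranking[0], ranking[-1])
```

Full paired comparison of n events requires n(n-1)/2 judgments, which is why a procedure of this kind is attractive when the number of events is large.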

It will be noted that this latter technique allows the judge to
make a comparative judgment and forces transitivity on the scale
resulting from judgments of the attribute. It thereby avoids the
possibility, present in paired comparison, that A may
be judged greater than B, B greater than C, and C greater than A.
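
Where paired comparison data have already been collected, such intransitive judgments can be located by examining every set of three events. The following Python sketch is an illustration only. The function name `circular_triads` and the sample judgments are hypothetical, and it assumes that every pair has been judged in exactly one direction.

```python
from itertools import combinations


def circular_triads(items, prefers):
    """List the triads judged in a circle, e.g. A > B, B > C, C > A.

    prefers(a, b) returns True when a was judged greater than b.
    Each circular triad is reported once, written in the order of the circle.
    """
    found = []
    for a, b, c in combinations(items, 3):
        if prefers(a, b) and prefers(b, c) and prefers(c, a):
            found.append((a, b, c))
        elif prefers(a, c) and prefers(c, b) and prefers(b, a):
            found.append((a, c, b))
    return found


# Hypothetical judgments: each pair (x, y) means x was judged greater than y.
judgments = {("A", "B"), ("B", "C"), ("C", "A"),
             ("A", "D"), ("B", "D"), ("C", "D")}
triads = circular_triads("ABCD", lambda x, y: (x, y) in judgments)
print(triads)
```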

The serious student of scaling methods should consult Torgerson
(1960)  as  well  as  Guilford  (1954). 

No ranking or ordering procedure can result in inter-judge
reliability without a clearly understood statement of the underlying
attribute being judged. Thus the importance of defining what is
meant by "importance to mission success" cannot be overemphasized.
Rating  methods  and  scaling  techniques  cannot  be  discussed  at  length 
here  but  the  necessity  for  their  being  taken  up  in  indoctrination 
courses  for  human  factors  field  evaluators  is  emphasized  in  the 
course  outline  given  in  Appendix  B. 

3.6  HUMAN  ENGINEERING  PRINCIPLES  AND  HANDBOOK  DATA  AS  CRITERIA 

The criteria most used during design and evaluation are human
factors design principles and data contained in various Human
Engineering Handbooks. The application of these principles
and data throughout the development period and throughout evaluation
serves a useful purpose. Their usefulness decreases, however, as
the system moves toward completion and dynamic full system
evaluation. The blind application of these principles and data
without careful consideration of the total system configuration,
the system's mission and operating environment can result in
incomplete, if not misleading, evaluative conclusions.

As  a  general  rule  the  human  factors  design  engineer  and 
evaluator  should  be  guided  by  the  principle  that  the  system  is 
being  designed  to  optimize  its  capability  for  executing  specified 
missions.  This  rarely  allows  for  optimizing  the  operator  station 
ideally  to  fit  human  capabilities  and  provide  for  his  comfort. 

Space  and  weight  are  at  a  premium  particularly  in  aircraft  and 
compromises  are  the  rule  in  fitting  components  together  in  order  to 
bring about total system-mission functioning. These compromises
are made toward the end of configuring the system to accomplish its
end goal or purpose. As a consequence, the evaluation of any
component within a real system cannot be in terms of some ideal
conditions under which it would operate best. Rather it must be
in terms of the best conditions possible within the real constraints
laid on the system. Thus human engineering principles used as
evaluative criteria by way of a check-list or similar device have
limited utility per se.

They have utility only when used within the context of real system
design constraints and system purpose. Similarly, the extrapolations
of handbook data must be made within the same context.

The  use  of  either  in  the  absence  of  knowledge  of  the  overall 
system constraints and purpose can lead to quite erroneous
conclusions as to the adequacy of the system design.

Human  engineering  principles  and  handbook  data  are  generally 
used  as  design  and  evaluation  criteria  early  in  system  development. 

They  can  be  used  most  appropriately  by  those  human  factors  engineers 
working  closely  with  equipment  design  engineers  and  with  those  most 
familiar  with  the  system  mission  and  its  constraints.  It  is  very 
difficult for anyone not completely familiar with the details of
the mission, component compromises and system constraints to apply
successfully a human engineering check list to the evaluation of an
operator  station. 

The successful use of these criteria early in development to
evaluate  the  operator  station  and  thus,  by  extrapolation,  to  predict 
how  successfully  the  operator  will  perform  in  the  system  is  termed 
"cockpit  oriented"  evaluation  in  this  report.  Here  the  operator 
station  is  examined  with  respect  to  work-place  layout,  design  of 
controls,  functional  placement  of  displays,  control-display 
relationships,  illumination,  anthropometric  measurements,  visibility 
and  the  like.  Assessments  are  then  made  as  to  the  adequacy  of  the 
design  in  meeting  these  "compatibility"  criteria.  Such  an 
evaluation  is  usually  first  carried  out  on  the  cockpit  mock-up  and 
repeated, although less formally, on into fleet use. Unfortunately,
at this time this type of evaluation may be essentially the only
type of human factors evaluation carried out for a system. It is
the only type that can reasonably be carried out, either formally
or informally, when the evaluator does not have detailed information
as to the mission of the system or the test points at which
empirical performance data may be obtained.

In  one  of  the  systems  examined  by  this  writer  the  yellow  sheets 
turned  in  by  the  "field"  evaluators  of  the  system  contained  over  85% 
cockpit  oriented  items.  That  is  to  say  that  the  evaluators  were 
concerning themselves with such items as parallax in reading an
instrument, inability to reach a control with harness locked and the
like, with no concern reflected in the yellow sheets for operator-system
capability for executing missions or mission segments. Of
course, one explanation for such a lack might be that the operator-system
combination did function to mission standards. A more likely
explanation is that the evaluator was not given information as to the
mission which would allow him to evaluate it against mission criteria,
which happened to be the case in this instance. Without such information
and without an orientation which would lead him to view the human
operator  as  a  component  which  must  deliver  up  certain  inputs  to  a  given 
level  of  accuracy  in  a  given  time  the  evaluator  understandably  carries 
on  in  the  tradition  of  the  mock-up  inspection  during  his  field 
evaluation. 

It is important that all concerned with systems evaluation come
to the realization that human factors evaluation is more than the
application of "good human engineering practice" to the cockpit.
They must begin thinking of the human as a functioning component
whose capabilities for processing information, as he is designed
into the system to process it, must be evaluated. Cockpit oriented
types of evaluation are useful early in development if carried out
by the proper people and with proper concern for equipment constraints
and system purpose as outlined earlier. Given these considerations
it is difficult to see how the mock-up inspection procedure as it is
presently exercised could possibly produce an evaluation with any
substantial  validity. 




3.7  MEASUREMENTS  AND  CRITERIA 


Two aspects of performance relate to mission success and are
important  in  human  factors  evaluation.  Mission  success  will  be 
compromised  if  a  component  fails  to  perform  either  to  the  required 
accuracy  or  within  the  required  time. 

3.7.1  Accuracy  Measurements 

The  appropriate  metric  for  use  in  assessing  human  component 
performance  when  he  functions  within  a  chain  of  equipment  components 
must  be  arrived  at  by  examining  the  outputs  from  the  equipment  at 
the designated test-points. Human component performance is inferred
from measurement of the equipment output to which the human component
must  provide  the  input.  This  will  be  the  usual  case  in  field  evaluation 
and  often  those  measures  of  interest  to  the  equipment  engineer 
(especially  during  Board  of  Inspection  and  Survey  (BIS)  Trials)  will 
be  appropriate  for  assessing  human  performance.  This  is  not  to  imply 
that  the  measurement  values  obtained  by  the  equipment  engineer  during 
his tests will necessarily be useful to the HF evaluator. The
equipment engineer is often interested in equipment performance in
response to standard or stylized inputs which may not be fully
representative of the way in which the equipment will be used by the
human operator in carrying out the evaluation mission. Nevertheless,
the  measurement  parameters  and  measurement  tools  used  by  the  equipment 
engineer  can  prove  highly  useful  to  the  human  factors  evaluator. 

The  responsibility  of  the  HF  evaluator  is  to  select  those 
measurement  parameters  from  the  engineer's  arsenal  which  most 
directly  reflect  the  human  component  performances  required  at  the 
test-points identified earlier and to apply them while the system
performs the evaluation mission. The human factors evaluator will
need to pay particular attention to those parameters which test
assumptions about human performance that were made during system
design on the basis of little or no data, or that
required rather extensive extrapolations from available data.

An  hypothetical  example  using  equipment  which  nudges  the  limits 
of  the  state-of-the-art  may  serve  to  make  the  responsibilities  of 
the  human  factors  evaluator  more  clear.  Suppose  that  a  direct 
lift  control  system  were  designed  into  an  attack  aircraft  for  use 
during  carrier  approach  and  landing.  The  equipment  engineer,  in 
testing  the  system  will  be  interested  in  determining  whether  the 
output,  measured  in  altitude  change  in  response  to  prescribed  standard 
inputs,  is  as  he  predicted  when  he  synthesized  the  system.  However, 
since  he  synthesized  the  system  with  a  human  component  in  the  chain 
he  made  certain  assumptions  about  the  characteristics  of  the  human 
component for receiving inputs and making outputs within the
given time constraints. The human factors evaluator, using the
same measurement parameters and tools, can test these assumptions
under conditions representative of the operational mission, i.e.,
actual carrier landings. More often than not the human factors
evaluator and the equipment engineer can work together in combining their
tests  to  mutual  benefit  and  improvement.  This  is  particularly 
applicable  during  the  earlier  stages  of  development. 

It  is  particularly  important  that  the  teaming  up  of  human 
factors  and  equipment  engineer  during  design  and  evaluation  be 
appreciated.  It  is  important  to  remember  also  that  whenever  it 
is  possible,  a  third  member  of  the  team  may  be  very  useful. 

This third member is an experienced operator of currently operational
systems which are similar to the system under test. This member
brings to the evaluation a knowledge of the operational requirements
and constraints to which the system will be subject. Again it
should be emphasized that it is important that he be assigned to the
team as early in the development sequence as possible.

What  can  be  said  about  specific  measurement  parameters  in 
human factors evaluation? It should not be necessary to warn
against selecting those measures which can be conveniently obtained
or nicely quantified unless they also meet the criteria of importance
to mission success. At the same time, the human factors evaluator
should not be reticent about insisting on particular measures or
special equipment in order to assess the human component. Every
component  in  the  system  must  be  evaluated  at  one  time  or  another 
during  the  development  of  the  system.  The  human  component  deserves 
no  less  consideration  than  any  other. 

As  to  specific  measures  these  will  vary  by  system,  at  different 
stages  of  system  development  and  with  the  evaluative  tools  available. 
With the system of concern here, i.e., the fighter and/or attack
aircraft, a frame of reference is suggested which will help to
insure inclusion of the appropriate measures. In this frame of
reference three major areas of concern are broken out. These are
(1) aircraft attitude control, or stabilization about the aircraft
axes, (2) navigation, or position and translation in three
dimensional space and (3) system state, or the condition of power
plant, armament subsystem, etc.

Within  the  first  category  it  should  be  first  kept  in  mind  that 
the  system  is  designed  to  be  a  platform  or  carrier  for  various 
ordnance.  As  such  it  must  attain  certain  attitudes  and/or  be 
stabilized  for  proper  delivery  of  the  weapon.  It  will  have  been 
assumed during design that the human component can provide inputs
into the system which will control the attitude of the aircraft
within limits which will result in weapon release with high
probability of hitting the target. This assumption may have been
only implicit but, nevertheless, was made and must hold for mission
success. Thus, angular deflections and rates in pitch, roll and
yaw  will  be  required  to  be  maintained  if  successful  weapon  delivery 
is  to  be  attained.  Therefore  an  assessment  of  the  human  operator's 
performance  in  maintaining  them  within  the  required  limits  is 
necessary. 

In some instances the human factors evaluator's task could
be that of synthesizing data already obtained on the system from
other sources. For example, if the required limits within which
aircraft body axes deflections and rates must be maintained for
weapon delivery were stated and if data were available from handling
quality studies which provided quantitative data on operator
performance, an assessment could be made of weapon delivery
success with the human operator in the loop. Here the team effort
mentioned earlier is essential. Unfortunately for this example
the handling quality data is obtained using highly experienced
and  well  trained  operators.  Their  performance  data,  therefore, 
could  not  well  be  considered  representative  of  that  of  the  fleet 
pilot  without  substantiating  information. 

With respect to the second category, navigation, a broader
meaning is intended than that ordinarily ascribed to the term.
Here the concern is with control of the position and/or rate of
change of the vehicle in three dimensional space. Therefore,
control with respect to these space axes during carrier approach
has  the  same  dimensions  of  measurement  as  during  cruise.  The 
standards  or  limits  within  which  control  must  be  exercised,  however, 
will  differ  by  mission  segment. 

The  third  category,  system  state,  may  be  thought  of  as  the 
condition  of  the  system  at  any  moment  in  time  apart  from  its 
attitude  and  spatial  position.  In  this  category  are  the  requirements 
that  the  power  plant  output,  armament  system  set-up,  etc.  be  in  a 
given  "state"  at  given  times.  It  contains  both  monitoring 
(perceptual)  activities  and  procedural  or  set-up  activities  of  the 
operator. Of particular concern in current attack aircraft is the
problem  of  the  armament  system  being  set-up  in  the  proper  state 
for  given  conditions  of  operation. 

In brief, certain measures are common to each of the above
categories with the standards required varying with particular
segments of the evaluation mission. Parenthetically, in some
instances it can be inferred from satisfactory operator performance
on a parameter during a given mission segment that performance on
that parameter will be satisfactory during another segment. For
example, from satisfactory performance under worst case conditions
satisfactory performance under less demanding conditions (in terms
of standards of accuracy) may be inferred.

The  measurement  parameters  appropriate  to  each  of  the  categories 
are  discussed  here  in  a  general  way  to  provide  an  orientation  for  the 
evaluation for aircraft systems in general. A matrix of specific
measures  suggested  for  use  with  the  attack  aircraft  is  given  in 
Appendix  A. 

For the first category (attitude control and stabilization)
measures of the pitch, roll and yaw angles and rates of the aircraft
may be used to infer the operator's ability to maintain these
parameters within required tolerances. For modern fighter and
attack aircraft, engineering knowledge in the areas of control
theory and stabilization systems plus heavy reliance upon the prior
training and experience of the transitioning pilot have resulted in
few apparent problems in this area for the evaluator. This situation
may  be  more  apparent  than  real. 

The  test  methodology  for  determining  whether  the  aircraft  meets 
certain performance specifications and exhibits certain handling
qualities  seems  fairly  well  established.  At  least  in  the  hands 
of  experienced  and  specially  trained  test  pilots  the  performance 
limits  and  handling  qualities  can  be  relatively  well  established. 

The  methodology  for  determining  whether  the  fleet  pilot  can  handle 
the  aircraft  within  limits  while  setting  up  his  armament  panel, 
executing  evasive  tactics  and  performing  various  and  sundry  other 
chores  is  not  established.  Obtaining  the  judgments  of  experienced 
test  pilots  in  evaluating  the  basic  performance  and  handling 
qualities  is  a  necessary  and  important  first  evaluative  step. 

However,  this  step  must  be  considered  as  a  test  of  the  performance 
of  the  hardware  rather  than  an  evaluation  of  the  performance  of  the 
human component for which the system was ultimately designed.

Therefore,  measurements  of  aircraft  attitude  parameters,  when 
the  aircraft  is  flown  by  fleet  pilots  under  conditions  at  least 
approximating the weapons delivery segment of the mission, are
necessary  to  obtain  evaluative  data  on  the  human  component  of 
interest. 
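
One way such attitude measurements might be reduced to an evaluative figure is the proportion of samples held within the required limits on each axis. The Python sketch below is illustrative only. The function name, the sample values and the tolerance limits are hypothetical and are not taken from any specification.

```python
def fraction_within_limits(samples, limits):
    """Return, for each parameter, the fraction of samples inside its limits.

    samples: list of dicts such as {"pitch": 0.5, "roll": 1.0, "yaw": 0.2}
    limits:  dict mapping each parameter name to a (low, high) pair
    """
    if not samples:
        raise ValueError("no samples recorded")
    counts = {name: 0 for name in limits}
    for sample in samples:
        for name, (low, high) in limits.items():
            if low <= sample[name] <= high:
                counts[name] += 1
    return {name: counts[name] / len(samples) for name in limits}


# Hypothetical deviations (degrees) recorded during a weapons delivery run.
limits = {"pitch": (-2.0, 2.0), "roll": (-5.0, 5.0), "yaw": (-1.0, 1.0)}
samples = [
    {"pitch": 0.5, "roll": 1.0, "yaw": 0.2},
    {"pitch": 1.9, "roll": -6.0, "yaw": 0.1},
    {"pitch": 2.4, "roll": 2.0, "yaw": -0.3},
    {"pitch": -1.0, "roll": 0.0, "yaw": 0.9},
]
result = fraction_within_limits(samples, limits)
print(result)
```

The same reduction applies to angular rates if the rate channels are recorded and given their own limits.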

For  the  second  category  (navigation)  the  argument  just  presented 
holds also. When the experienced test pilot tests the system, that
test  must  be  considered  essentially  a  test  of  the  hardware.  As  a 
test  pilot  he  tries  to  control  and  standardize  his  inputs  so  as  to 
determine  the  operation  of  the  equipment  he  is  controlling.  Once 
the  adequacy  of  the  equipment  is  established  it  remains  to  be 
determined whether the fleet pilot, working with that equipment,
can produce the requisite performance to bring about successful
system performance. The latter is the concern of the human factors
evaluator. 

In  the  navigation  category,  evaluative  measures  take  as  their 
standard  and  starting  point  the  notion  that  the  aircraft,  at  any 
given  moment  in  time,  must  be  at  a  given  position  in  three 
dimensional space and/or the first and second derivatives of
position must be within certain tolerances. That is, the operator
must function to control the position of the aircraft, its rates
and accelerations to be within specified limits. These limits must
be determined for each segment of the mission, e.g., carrier approach,
touchdown, etc. During development these criteria will necessarily
have  been  stated  or  assumed  in  synthesizing  the  system.  Their 
explicit  identification  and  the  establishment  of  test-point 
priority  have  been  discussed  earlier. 
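
Where only position is recorded at the test point, the first and second derivatives can be approximated from successive samples and checked against the limits for the mission segment. The Python sketch below assumes evenly spaced samples and simple finite differences. The function names, the altitude samples and the limits are hypothetical.

```python
def rates_and_accelerations(positions, dt):
    """Approximate first and second derivatives of position by finite differences.

    positions: samples along one space axis, taken dt seconds apart
    """
    rates = [(b - a) / dt for a, b in zip(positions, positions[1:])]
    accels = [(b - a) / dt for a, b in zip(rates, rates[1:])]
    return rates, accels


def all_within(values, low, high):
    """True when every value lies inside the stated limits."""
    return all(low <= v <= high for v in values)


# Hypothetical altitude samples (feet), one second apart, during approach.
altitude_ft = [100.0, 98.0, 96.0, 95.0]
rates, accels = rates_and_accelerations(altitude_ft, dt=1.0)
print(rates, accels)
print(all_within(rates, -3.0, 0.0), all_within(accels, -0.5, 0.5))
```

Finite differences amplify measurement noise, so recorded data would ordinarily be smoothed before limits on acceleration are applied.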

The  third  category,  system  state,  including  as  it  does 
monitoring  and  procedural  activities,  is  highly  important  in  the 
present generation of fighter and attack aircraft. The diversity
of ordnance capable of being carried places a premium on the
operator's ability to carry out armament panel set-up procedures
under severe time constraints. It is in this category that a
number of decisions and procedural activities are likely to be
relegated to the pilot as development progresses and new
capabilities and tactics are foreseen for the aircraft as a weapons
system. The pilot's ability to carry out the functions in this
area is more generally indicative of his ability to use the
aircraft in its intended role as a weapon, while the first two
categories apply more generally to his ability to control it as
an aircraft. The human factors evaluator will need to be concerned
with this category, and test points within it will tend to have
high priority for mission success.

3.7.2  Time  as  a  Measure 


In  the  course  of  this  study  it  has  become  apparent  (at  least  to 
this  investigator)  that  an  important  and  usually  critical  parameter 
for evaluation is that of time. In the systems of concern in this
study conditions change rapidly and information update rate is
often critical to successful system performance. In actuality this
is true of many systems other than fighter and attack aircraft.
Therefore,  a  critical  question  to  be  asked  in  the  evaluation  is 
whether  or  not  the  human  component  can  execute  the  required 
operations  in  the  prescribed  time. 

The  equipment  engineer  is  likely  not  to  regard  time  as  a  critical 
factor  in  the  sense  that  he  worries  about  his  components  reacting 
too  slowly.  However,  it  is  important  to  recognize  when  man  is 
introduced  into  the  system  that  he  does  not  react  with  the  speed  of 
equipment  components  and  often  may  take  a  critically  long  time  to 
perform  a  given  function.  In  fact,  our  major  concern  with  man  in 
today's  complex  system  may  be  more  with  his  inability  to  perform 
all  of  the  functions  in  time  rather  than  his  inability  to  perform 
accurately. 

In  singling  out  time  as  a  measure  of  performance  it  is  assumed 
that  the  human  operator,  unlike  the  black  box  components,  recognizes  the 
need for accurate performance in order to insure his own safety and
the success of the system (or both) and strives to achieve it. Thus,
time  to  execute  becomes  an  important  standard  against  which  the 
human  component  must  be  evaluated. 

The time limits within which the total system must achieve a
mission or mission segment can be relatively objectively fixed.
The number and sequence of operations required of the system
components within this time frame can also be enumerated. Again,
whether the human component executes his required actions within
the time limit may be measured directly or inferred depending
upon the location of the test point within the component chain.

As with accuracy measures, the conditions of test and the equipment
characteristics must be completely described in order that the
data will be maximally useful for future designs. Also, whenever
possible, actual time-to-execute measurements should be taken
irrespective and independent of whether the operator performed
the function within the required time, again with the object of
obtaining data useful to future system design.
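
The recommendation to keep the actual times, and not merely a pass or fail mark, can be illustrated with a short Python sketch. The function name `summarize_times`, the field names and the sample times are hypothetical and serve only to show the kind of record that remains useful to later designs.

```python
from statistics import median


def summarize_times(times_s, limit_s):
    """Summarize time-to-execute measurements against a required time limit.

    The individual margins are retained so that the data remain useful
    whether or not the operator met the limit on a given trial.
    """
    if not times_s:
        raise ValueError("no trials recorded")
    return {
        "trials": len(times_s),
        "within_limit": sum(1 for t in times_s if t <= limit_s),
        "median_s": median(times_s),
        "slowest_s": max(times_s),
        "margins_s": [limit_s - t for t in times_s],  # negative = over the limit
    }


# Hypothetical set-up times (seconds) against a 6 second requirement.
summary = summarize_times([4, 6, 5, 9], limit_s=6)
print(summary)
```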


4.0  LEVELS  OF  EVALUATION 


4.1  THE  ROLE  AND  DEFINITION  OF  FIELD  EVALUATION 

In  any  study  of  the  human  factors  evaluation  problem,  restricting 
the study to "field" evaluation creates a certain dilemma for the
investigator. First of all he faces the difficulty of defining
where field evaluation begins and where it ends. For although there
are formally defined stages in evaluation, e.g., BIS trials, in
actuality evaluation does and must begin much earlier and extend
into operational use of the system.

Secondly, there must be some definition of what field evaluations
are  to  do.  Broadly  speaking  they  are  meant  to  test  the  system 
against  assumptions  made  about  its  performance  when  it  was  originally 
conceived.  In  a  very  real  sense  any  test  which  predicts  how  well  the 
system  will  perform  is  desirable  at  whatever  point  in  the  development 
process  it  is  conducted.  The  element  which  is  added  to  test  and 
evaluation through conduct in the field is presumably that variables
and  conditions  are  more  nearly  representative  of  those  in  fleet 
operations  and  therefore  more  valid.  This  representativeness  varies 
from  system  to  system  depending  upon  similarity  of  the  system  to 
previous  systems,  urgency  of  system  need,  availability  of  test 
personnel  and  equipment  and  the  like. 

Field  tests  conducted  late  in  or  at  the  end  of  the  development 
period generally will have greater "content" validity than those
conducted earlier. That is to say that the equipment to be tested
and the conditions of test will be judged by competent evaluators
to be good likenesses of the ultimate criterion: fleet use in
operations. However, deficiencies found in the system at this stage
are more costly and difficult to correct. It is a common feeling
among evaluators that by the time the system reaches the field
evaluation point the system design is "set in concrete" and quite
resistive  to  change. 

This  inflexibility  is  particularly  injurious  to  the  human 
factors component since the human operator is viewed in a different
light from hardware components. Because he is adaptable, it is often
felt that he can "learn to live with" a deficiency or that it can be
corrected through training or special selection of the operator. A
change in the system at the field stage to better accommodate the human
component has been most difficult unless it is declared an item
affecting the safety of the operator. It is desirable, therefore,
that valid evaluation of the human component performance be carried
out as early in the development process as possible. (Parenthetically,
this is true for any component.) The more valid the component testing
early in development the fewer problems that will arise during field
testing or subsequently in fleet use. The most desirable situation
would be one in which tests carried out at various stages of development
have high predictive validity for performance in the fleet.
This predictive validity, in which empirical relationships between
tests and fleet performance are established, cannot be obtained until
reliable measures of performance in the fleet are possible, something
toward which work should be directed but which is not within the
purview of this study.

However, at some point before actual fleet deployment an
evaluation must be made to determine whether the system performs
to the original conception and specifications. The results of this
evaluation can and must serve as criteria against which to validate
tests conducted earlier in the development process. Validated tests
conducted earlier in the development would reduce human factors system
deficiencies at this accept-reject point and deficiencies could be
corrected  at  a  time  when  the  system  design  is  more  amenable  to 
change. 


4.2  EQUIPMENT  EVALUATION  VS.  EQUIPMENT-OPERATOR  INTERFACE  EVALUATION 

The  system  equipment  undergoes  almost  constant  evaluation 
throughout  development  continuing  up  through  Technical  Evaluation. 

In  many  ways  the  equipment  designer  can  test  his  equipment  under 
conditions  which  equal  or  go  beyond  the  constraints  and  conditions 
of its operational use. During these evaluations variables can be
and  are  rather  carefully  controlled.  When  a  human  operator  is  used 
in  these  equipment  evaluations  he  is  trained  and  instructed  to  make 
carefully  controlled  inputs  to  the  system  and  to  make  prescribed 
observations. Emphasis is on equipment functioning and test, which
is reasonable and necessary. It should be recognized and remembered,
however, that these are evaluations of equipment and not of
operator-equipment interfaces as they would occur in fleet operations using
fleet operators. The human operator's performance as he interacts with
the equipment is not being evaluated through measurement at specified
equipment test-points. (These test points have been discussed in
Section 2.2). Rather the equipment is being evaluated as it functions
with an operator closing the control loop but an operator with as
standard  inputs  as  possible  in  order  to  get  a  clear  picture  of 
equipment  functioning.  These  are  Technical  Evaluations  and  have 
been  considered  a  necessary  prerequisite  to  operational  evaluation 
in  which  the  system  is  exercised  more  nearly  as  it  will  be  in 
operations. 

From these equipment oriented evaluations some information
relative to the human component is obtained. However, the information
tends to be cockpit evaluation oriented rather than mission-system
oriented. The experienced test pilots who work with the
system up through Technical Evaluation can and do make observations
as to cockpit design deficiencies and yellow sheets are generated.
However, it is difficult to get the test pilot to think "mission"
at this stage when his job is really to test the assumptions and
specifications relative to the equipment. Theoretically, at this
point  the  human  factors  aspects  must  also  be  evaluated  but  it  is 
most  difficult  for  the  test  pilot  to  put  himself  into  the  place  of 
the  operational  pilot  and  to  make  observations  relative  to  the 
adequacy of the operator-equipment interfaces in carrying out mission
functions. He can and does make observations as to such things as
control accessibility, lack of visibility, difficulty in carrying out
procedures and the like for the mission segments which he performs.
That is to say that, in his Technical Evaluation, he will perform
the mission segments common to fighter and attack aircraft such as
launch, climb out, cruise, approach, etc. During these evaluations
he is in a position to observe deficiencies both in the equipment
functioning and in cockpit design which make his job as an operator
difficult or less than optimum. However, the man-machine adequacy
in  performing  these  mission  segments  is  assessed  only  with  a  highly 
experienced  operator  in  the  loop. 

4.2.1  Test  Point  Measures 


During  Operational  Evaluations  human  factors  evaluation  takes 
on  more  of  the  mission  oriented  and  less  of  the  cockpit  oriented 
flavor. The components and subsystems (component chains) can be
tested for their adequacy in performing functions which relate
more directly to ultimate mission purpose, i.e., delivery of
ordnance. The system synthesis process and the necessity for
component and subsystem evaluation have been discussed earlier in
Section 2.1. In testing the human component, the use of test
points downstream in the component chain from the human component has
also been discussed. Further it has been suggested that these test
points in the total system be identified and their relative
importance to mission success be established so that during total
system test, either direct or inferential human performance data
can be obtained. At these test points standards of performance
must be set within which each component must operate in order for
the combined component performances to summate into total system
success. These standards will vary with the purpose (mission) to
which the system is put and by mission segment. The definition of
an evaluation mission has also been recommended in order that
conditions  of  test  may  be  standardized  and  repeated  measures  taken 
under  uniform  conditions.  The  suggested  parameters  for  obtaining 
data  at  these  test  points  are  given  in  Appendix  A. 
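
The idea of standards that vary by mission segment can be illustrated
with a minimal sketch. The test point names, segments and tolerance
bands below are hypothetical and are not taken from Appendix A; they
serve only to show how a measurement at a test point might be compared
against the standard set for the segment in which it was taken.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Standard:
    """Acceptable band for one test point during one mission segment."""
    low: float
    high: float

    def met_by(self, measured: float) -> bool:
        return self.low <= measured <= self.high


# Hypothetical standards keyed by (mission segment, test point).
# The same test point could carry a different band in another segment.
STANDARDS = {
    ("approach", "glide_slope_error_deg"): Standard(-0.5, 0.5),
    ("approach", "airspeed_error_kt"): Standard(-5.0, 5.0),
    ("cruise", "altitude_error_ft"): Standard(-100.0, 100.0),
}


def evaluate_segment(segment, measurements):
    """Return {test_point: True/False} for every measured test point
    that has a standard defined for this mission segment."""
    results = {}
    for test_point, value in measurements.items():
        standard = STANDARDS.get((segment, test_point))
        if standard is not None:
            results[test_point] = standard.met_by(value)
    return results


def segment_success(segment, measurements):
    """A segment succeeds only if at least one test point was checked
    and every checked test point is within its standard."""
    results = evaluate_segment(segment, measurements)
    return bool(results) and all(results.values())
```

Repeating the same evaluation mission under uniform conditions would
then yield comparable pass or fail records at each test point.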

4.2.2  Trouble  Shooting 

When subsystems or chains of components are tested and they
do not come up to standard it is necessary to determine where the
problem lies. When measuring at some test point at the end of a
chain of components in order to infer the adequacy of the human
performance, we assume that the equipment is functioning properly
and that any inadequacies discovered lie in the human output.
However, the focus of the malfunction within the component chain
is not always clear and cannot be reliably attributed to human
component failure. At times the test point may be at the end of a
complex combination of components such that it is not easily inferred
as to where the failure lies. In such cases it is necessary for the
operator to be trained in and have the necessary tools for reliable
retrospection or concurrent observation of human component problems
as they occur. These observations serve to trouble shoot the subsystem
and point to the deficiency in the design which leads to
inadequate  performance  at  the  operator-equipment  interface. 

At  present  this  type  of  information  is  not  obtained  in  any 
systematic  or  formal  manner.  Some  test  pilots  record  their 
observations  incidental  to  their  report  of  their  observations  of 
equipment  functioning.  It  has  been  reported  to  this  investigator 
that often an operator-equipment interface problem is not reported
since  the  pilot  feels  it  to  be  something  that  would  happen  only 
infrequently  and  probably  not  at  all  after  the  operators  become 
familiar  with  the  system.  At  other  times  they  are  not  reported 
because  the  test  pilot  feels  that  a  problem  which  has  arisen 
reflects  upon  his  own  ability  as  a  test  pilot  or  is  not  typical  of 
his  performance  so  he  tends  to  overlook  reporting  it. 

A  formal  list  or  guide  is  needed  to  systematize  the  pilot's 
reports with respect to deficiencies which cannot be
detected  from  measurements  of  the  subsystem  performance  in  order 
that  these  problems  can  become  a  matter  of  record  and  a  part  of 
the  formal  evaluation.  These  reports  should  be  accompanied  by  a 
complete  description  of  the  conditions  under  which  the  deficiency 
occurred,  the  equipment  from  which  the  operator  received  his  inputs 
and  to  which  he  made  outputs  and  his  level  of  experience  and 
familiarity  with  the  system.  This  type  of  data  can  then  be 
accumulated  as  a  data  bank  for  guidance  in  the  design  of  future 
systems as well as to guide correction of the deficiency.
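
One way such a formal report and data bank might be organized is
sketched below. The field names, the `DeficiencyReport` record and the
`DataBank` class are illustrative assumptions, not an existing Navy
form; they simply carry the items named above (conditions, input and
output equipment, experience and familiarity).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DeficiencyReport:
    """One pilot report of an operator-equipment interface deficiency."""
    description: str        # what went wrong at the interface
    conditions: str         # conditions under which it occurred
    input_equipment: list   # equipment from which the operator received inputs
    output_equipment: list  # equipment to which the operator made outputs
    experience_hours: float # operator's level of experience
    familiarity: str        # operator's familiarity with this system


class DataBank:
    """Accumulates reports so they become a matter of record."""

    def __init__(self):
        self._reports = []

    def add(self, report):
        # Refuse incomplete reports: the conditions of occurrence are
        # what make the record useful to designers of future systems.
        for name in ("description", "conditions", "familiarity"):
            if not getattr(report, name).strip():
                raise ValueError(f"report field '{name}' must not be blank")
        self._reports.append(report)

    def by_equipment(self):
        """Index reports by every piece of equipment they mention."""
        index = defaultdict(list)
        for report in self._reports:
            involved = set(report.input_equipment) | set(report.output_equipment)
            for item in involved:
                index[item].append(report)
        return dict(index)
```

Indexing by equipment is only one possible retrieval scheme; mission
segment or operator experience level could serve equally well.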

4.2.3  Operator  Adaptation 

The  test  pilots  must  be  especially  indoctrinated  to  the  problem 
of  evaluation  which  arises  when  the  subsystem  being  evaluated 
functions properly but the human operator has had to perform in an
adaptive  way  or  to  recover  from  a  mistake  in  order  to  make  it 
operate  properly.  If  measurements  are  being  taken  at  a  test  point 
and  the  subsystem  operates  to  standard  there  is  no  way  of  detecting 
performances  in  which  the  pilot  adapted  or  recovered  from  errors  and 
thus  caused  the  subsystem  to  function  correctly.  Therefore  a  formal 
and systematic method is necessary such that the test pilot will
report  in  detail  such  behavior  along  with  the  conditions  of  the 
equipment  and  his  level  of  experience  and  familiarity  with  the  system. 
These  reports  may  lead  either  to  a  redesign  of  the  system  in  order 
that such adaptive behavior is not necessary or may be incorporated
into training programs or procedural changes.

Indoctrination and training of test personnel is discussed
in  Section  5.5  and  a  suggested  course  outline  is  given  in  Appendix  B. 

4.3 VALIDATION AND USE OF EARLIER TESTING

It  has  been  pointed  out  earlier  that  discovering  man-machine 
interface  deficiencies  at  the  Techeval  or  Opeval  level  makes  them 
difficult  to  remedy  since  the  system  is  so  far  along  in  the 
development process. It has also been stated that the Techeval and
Opeval must serve as a substitute for the ultimate criteria of
fleet operations (at least at the moment) as well as serving the
function of testing the assumptions made during the conception and
development process. In this intermediate criteria role the
Techeval and Opeval can serve as criteria against which to validate
testing carried out earlier in the development process. The earlier
valid testing can be carried out, the greater the opportunity to
influence  design.  These  earlier  tests  take  a  variety  of  forms. 

4.3.1  Mathematical  Models 

The high speed digital computer has opened the possibility of
using  mathematical  models  to  manipulate  system  variables  and 
determine  the  effect  upon  total  system  operation.  In  their  use 
mathematical  models  are  set  up  to  simulate  parameters  of  the  system 
so  that  values  of  the  parameters  can  be  varied  and  interactions 
among parameters can operate.

These  models  can  be  categorized  as  falling  into  the  classes  of 
(1)  "time  available"  models,  (2)  reliability  or  probability  of 
failure  models,  and  (3)  error  and  importance  of  error  models.  Their 
limiting  feature  at  the  present  is  the  paucity  of  relevant  data. 

Their  appeal  lies  in  their  ability  to  handle  a  large  number  of 
parameters  and  their  interactions  and  to  very  quickly  determine  how 
changes  in  these  parameters  will  affect  the  system  operation. 
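
As an illustration only, the second class of model (reliability or
probability of failure) can be sketched in its simplest form: a chain
of components, the human operator among them, treated as independent
and in series. The independence assumption, the function names and the
numbers in the usage note are hypothetical and do not represent any
particular model reviewed in this study.

```python
import math


def chain_reliability(component_reliabilities):
    """Probability that every component in a series chain performs to
    standard, assuming the components fail independently."""
    for p in component_reliabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each reliability must lie between 0 and 1")
    return math.prod(component_reliabilities)


def sweep_component(component_reliabilities, index, candidate_values):
    """Vary one component's reliability (for example the human
    component) and report the resulting chain reliability for each
    candidate value, holding the other components fixed."""
    results = []
    for value in candidate_values:
        trial = list(component_reliabilities)
        trial[index] = value
        results.append((value, chain_reliability(trial)))
    return results
```

For example, `sweep_component([0.99, 0.90, 0.98], 1, [0.90, 0.95])`
would show how much the chain gains if the second component, taken here
to be the operator, is improved while the equipment is left unchanged.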

Their potential contribution to testing early in development is
great; however, at the moment no impressive data exist as to
their validity.

No recommendations are made in this report with respect to
any  particular  model.  The  contribution  of  the  present  work  to  the 
model  approach  can  be  greatest  in  setting  up  of  methods  and  techniques 
for  establishing  test  points  and  data  gathering  and  insisting  that 
the  conditions  of  test  be  fully  described  so  as  to  make  the  data 
maximally useful in modeling future systems.

4.3.2  Physical  Models 

Physical  models  of  the  ultimate  system  range  from  simple  static 
mock-ups  of  a  part  of  the  system  through  dynamic  simulators  to  the 
aircraft itself. This section discusses these models and their role
in human factors evaluation.

4.3.2.1 Static Mock-Up - At present the physical model most
used in human factors design and evaluation is the cockpit mock-up.
Since the mock-up is both well used and mis-used it deserves special
discussion.

The  mock-up  is  used  intensively  and  almost  continuously  by 
design personnel as an evaluative tool during the design process.
In this role it is both useful and necessary. However, the static
mock-up is the focal point for formal human factors evaluation which,
if not outmoded, needs improvement. The successful evolution and
validation of mathematical models implemented on the high speed
computer may eventually replace and go much beyond many of the
functions  now  fulfilled  by  the  static  mock-up.  In  the  meantime, 
it  is  believed  by  this  investigator  that  major  improvements  can 
be  made  in  the  human  factors  evaluation  procedures  and  techniques 
which  are  used  during  mock-up  inspections  as  they  are  now  constituted. 
This  belief  has  been  found  to  be  almost  universally  supported  by 
those having experience with the mock-up inspection.

As  a  background  for  discussing  the  mock-up  inspection  and  the 
role  it  plays  in  overall  evaluation  it  is  necessary  to  discuss  two 
levels  of  evaluation.  The  first  level  is  that  which  was  indicated  in 
an earlier section as being "cockpit oriented" evaluation. The
second  was  referred  to  as  "mission-oriented"  evaluation. 

At  the  first  level  of  evaluation  the  cockpit  mock-up  is  examined 
for  what  may  be  termed  compatibility  with  the  operator's  capabilities 
and limitations. The operator station is examined in the light of
human  engineering  design  criteria  such  as  conformity  with  good  human 
engineering  principles  and  with  handbook  data.  This  means  evaluating 
on  the  basis  of  work-place  layout,  control  coding,  control-display 
relationships, illumination, anthropometric compatibility and the
like. Assessments are made as to the adequacy of the design for
meeting these design principles and handbook data criteria. The
conscientious  evaluator  will  also  evaluate  on  the  basis  of  functional 
arrangement  of  displays  and  controls.  This  type  of  evaluation  is 
cockpit  oriented  evaluation. 

To  carry  out  a  human  factors  evaluation  of  this  type  there  are 
certain  requirements  with  respect  to  personnel,  method  and  techniques. 
Personnel  must  be  knowledgeable  with  respect  to  the  human  factors 
literature and data, be able to critically evaluate that data, and
extrapolate  from  it  in  the  light  of  the  particular  system  being 
evaluated.  Further,  they  must  be  cognizant  of  the  principles  of 
good  human  factors  design  insofar  as  these  are  known  and  can  be 
applied  to  the  specific  system  being  evaluated.  The  areas  and 
points  to  be  covered  during  the  evaluation  should  be  incorporated 
into  a  checklist  such  that  a  systematic  examination  of  the  operator 
station may be made and to ensure that important areas are not overlooked.

It is particularly important that the human factors evaluator
be  knowledgeable  with  respect  to  the  personal  equipment  to  be  worn 
by  the  operator  and  that  evaluations,  particularly  with  respect  to 
work space layout and anthropometrics, be conducted taking into
account the effect of personal equipment upon performance. This
latter point seems minor but nevertheless is sometimes overlooked
and can result in major problems.

No  disparagement  of  this  type  of  evaluation  is  intended  if 
carried out properly by competent and informed personnel. Indeed,
it  is  a  necessary  antecedent  to  evaluations  of  the  second  type. 

The  second  level  or  mission  oriented  evaluation  requires  an 
examination  of  the  operator  station  in  the  light  of  operator  functions 
and  tasks  and  an  assessment  of  whether  the  system  will  do  what  it 
was  designed  to  do  to  some  specified  criterion  level.  In  order 
to  carry  out  such  an  evaluation  certain  data  and  information  must 
be  known  over  and  above  that  required  for  carrying  out  a  cockpit 
oriented  evaluation.  It  is  necessary  to  know  in  detail  what  functions 
and tasks the operator is required to perform, in what sequence,
and  to  what  criteria.  These  criteria  are  phrased  in  terms  of 
performance  standards,  sequence  of  operation  and  time.  That  is  to 
say  that  the  determination  must  be  made  as  to  whether  the  operator 
can  carry  out  his  functions  and  tasks  in  the  proper  sequence  to  the 
required accuracy within the required time.
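
The three-part criterion just stated (proper sequence, required
accuracy, required time) can be expressed as a minimal sketch. The
record types, task names and limits are hypothetical; an actual
evaluation would draw them from the function and task analyses for the
system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskCriterion:
    name: str           # task the operator must perform
    max_error: float    # largest acceptable absolute error
    max_seconds: float  # longest acceptable time to complete


@dataclass(frozen=True)
class TaskObservation:
    name: str
    error: float
    seconds: float


def evaluate_tasks(criteria, observations):
    """Compare observed performance with the criteria, which are listed
    in the required order of performance."""
    required_order = [c.name for c in criteria]
    observed_order = [o.name for o in observations]
    observed = {o.name: o for o in observations}

    sequence_ok = observed_order == required_order
    accuracy_ok = all(
        c.name in observed and abs(observed[c.name].error) <= c.max_error
        for c in criteria
    )
    time_ok = all(
        c.name in observed and observed[c.name].seconds <= c.max_seconds
        for c in criteria
    )
    return {
        "sequence": sequence_ok,
        "accuracy": accuracy_ok,
        "time": time_ok,
        "meets_criteria": sequence_ok and accuracy_ok and time_ok,
    }
```

Reporting the three results separately, rather than a single pass or
fail, shows which kind of requirement the operator station failed to
support.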

Any  mission  oriented  type  of  evaluation  requires  the  generation 
of  very  specific  and  detailed  data  regarding  the  system.  To  carry 
out  such  an  evaluation  the  evaluator  must  have  specific  and  detailed 
information  about  subsystem  functions,  data  flow,  and  the  requirements 
placed upon the human operator with respect to sequences, accuracies
and time for carrying out his operations. He must have this information
as it applies to the various segments of the system mission
and  certainly  for  the  evaluation  mission  if  one  has  been  generated. 

The static mock-up as a physical model serves very well as a
tool  in  helping  to  make  decisions  and  evaluations  about  physical 
arrangements  of  the  cockpit.  It  is  not  an  adequate  tool  for  the 
mission  oriented  type  of  evaluation  in  which  assessments  are  made 
of  whether  or  not  the  operator  will  be  able  to  perform  to  the 
standards required by the mission - a more critical requirement.

It  is  relevant  to  raise  the  question  as  to  the  advisability  of 
a  mock-up  inspection  in  which  the  static  mock-up  is  evaluated  and  a 
binding  decision  made  as  to  its  configuration.  The  advisability 
is  questioned  because  of  difficulties  with  it  experienced  by  this 
investigator and echoed almost universally by others with mock-up
inspection experience. The first difficulty is that dynamic man-machine
interactive performance cannot be evaluated. This, of
course, the static mock-up evaluation has made no pretense at doing.
However, this performance relative to the required standard is the
central  question  particularly  when  it  is  recalled  that  the  operator 
station  must  be  configured  to  optimize  total  system  functioning  for 
performing  a  mission  rather  than  optimized  to  conform  to  the 
operator's capabilities and limitations.

Before  the  advent  of  such  complex  systems  as  we  now  have  with 
their  wide  variety  and  sophistication  of  armament  the  mock-up 
inspection was a natural procedure and served a useful purpose.
In simpler systems whose effectiveness and employment were much more
dependent upon the ingenuity of the operator the cockpit layout for
operator convenience was undoubtedly of major importance. In the
complex systems of today the operator is required to perform more of a
dynamic "component in the system" role and functions to process information
in a standard way toward the purpose of bringing about a
specific end product through the employment of specially designed
equipment. The static mock-up does not lend itself to such an
evaluation and in a real sense is vestigial, having not only outlived
its usefulness but also preventing, by the inertia of its use, the adoption
of more effective procedures.

The second difficulty is that in order to make useful cockpit
oriented evaluations, the mock-up inspection team personnel must
have the qualifications outlined earlier. That is, they must be
cognizant of the principles of good human engineering insofar as
these are known and can be applied to the particular system being
evaluated.  This  is  too  often  not  the  case. 

Mock-up  inspections  have  been  carried  out  by  teams  composed  of 
personnel  representing  the  various  specialities  concerned  with  the 
system. Judgments and evaluations as to the adequacy of the human
factors  design  generally  are  made  by  any  member  of  the  team.  Often 
these  judgments  reflect  the  particular  experience  and  preferences 
of  the  individual  rather  than  the  considered  application  of  known 
principles. 

A third difficulty is that in order to even approximate an
evaluation which is mission oriented during a static mock-up
inspection, the team member must have important additional information.
In addition to knowledge of general human factors principles and a
checklist to determine systematically the adequacy of design from the
cockpit orientation point of view, he must have detailed information
with respect to specific tasks to be carried out by the operator as
he functions as a component within the system in performing the
mission. He must have this information as it applies both to normal
and  "worst  case"  operation  of  the  system  preferably  within  the 
context  of  a  well  detailed  evaluation  mission.  He  must  also  have 
intimate  knowledge  of  the  equipment  with  which  the  operator  interacts 
to  include  detailed  information  on  data  flow,  flow  rates  and  required 
outputs  from  the  operator.  He  must  then  project  himself  into  the 
role  of  the  operator  functioning  to  handle  these  data  to  bring 
about  mission  accomplishment  -  a  procedure  not  likely  to  produce  high 
predictive validity.

It is virtually impossible for an inspection team member to
have  such  familiarity  with  the  system  when  he  comes  to  the  mock-up 
inspection.  Nor  is  it  possible  for  him  to  acquire  a  sufficient 
appreciation  of  it  during  the  inspection  time  period.  The  net  result 
is that the inspection team members make human factor evaluative
judgments  based  upon  limited  knowledge  of  the  intricacies  of  the 
equipment  and  details  of  the  mission.  Their  judgments  therefore 
tend  to  stem  from  their  own  personal  likes  and  experiences.  They 
must rely upon the orientation and guidance from contractor human
factors personnel who are familiar with the system from their work
with it and upon available human factors analyses data. The short
period  of  a  mock-up  inspection  is  not  sufficient  to  gain  a  comprehensive 
picture  of  the  system  and  its  requirements. 

A  third  alternative  is  recommended  which  would  prove  feasible 
with most systems. Since great familiarity with the system equipment
and mission is required the assignment of customer personnel to the
contractor's facility during development is suggested. These personnel
would be experienced operators of the prior system and would provide
expertise in operational problems and usage of the system. In turn
they would learn the details of the equipment with which the operator
must work and what the system was designed to do. Working together
with contractor personnel the cockpit design would be evolved to
be  "frozen"  at  a  designated  point  during  the  development  process. 

These  customer  personnel  would  represent  the  human  factors  area 
during  the  mock-up  inspection  and  be  members  of  the  inspection  team. 

The  requisite  qualifications  of  these  personnel  are  discussed  in 
greater  detail  in  Section  5.3. 

4.3.2.2 Simulators - The simulator as a dynamic physical model
of the cockpit holds many advantages over the static mock-up. From
the  evaluators'  point  of  view  it  offers  flexibility,  opportunity  to 
obtain  reliable  performance  data  and  a  test  situation  more 
representative  of  the  real  system. 

It  offers  an  opportunity  to  provide  data  to  the  later  "field" 
evaluations  which  may  lighten  their  load  and  even  go  beyond  what 
is  possible  in  the  aircraft  itself.  The  use  of  such  data  is 
encouraged  by  Board  of  Inspection  and  Survey  Aircraft  Test  Directive 
No.  1-6,  17  September  1965  as  follows. 

3. Policy. It is the policy of the Board to accept for trials
purposes  data  from  any  source  provided  that  in  the  judgment 
of  the  activity  conducting  the  trials  the  data  are  valid 
and  are  fully  representative  of  the  production  article 
undergoing  trials. 

It has been reported to this investigator that the simulator
cannot be built in time to be useful for most systems. This disadvantage
appears to be lessening as designers become more knowledgeable
about simulator requirements and more of them come into existence.
More and more they are coming to be used to solve special design
problems and to evaluate decisions about particular subsystems.
The Heads up Display (HUD) is a case in point. It is also pointed
out that simulation is much more expensive. However, it is also
acknowledged  that  it  takes  only  a  few  ECP's  or  aircraft  accidents 
to  pay  for  the  simulator  cost. 

An  additional  advantage  of  using  a  simulator  during  design  is 
the  knowledge  to  be  gained  about  training  problems,  training 
procedures and training device configuration. It is difficult to
estimate the time and money that this saves. Parenthetically, the
experienced customer personnel assigned to the contractor's facility
(recommended in Section 4.3.2.1) can also contribute greatly to
training  plans  and  training  equipment  specifications  if  they  are 
given  the  opportunity  to  participate  in  design  and  evaluation  using 
a  simulator. 

The  simulator  is  demonstrating  its  effectiveness  more  and 
more  during  design  and  evaluation.  Its  use  during  mock-up 
inspection  as  those  inspections  are  now  conducted  would  give  the 
inspection  team  members  greater  insight  into  the  human  factors 
design.  However,  its  effectiveness  would  be  much  greater  if  used 
on a continuing basis by contractor and customer personnel to arrive
at  a  suitable  configuration.  The  simulator  to  be  most  effectively 
used  requires  personnel  trained  in  the  conduct  of  evaluation  using 
such tools. The customer personnel could contribute greatly to
identifying  the  important  parameters,  conditions  and  constraints  to 
be  considered  during  any  test. 

Simulator tests also require the collection of data over
some period of time using representative operators. The data require
reduction and analysis which, while becoming less and less of a
burden with modern computer technology, nevertheless do not lend
themselves to the "spot check" type of evaluation characteristic of a
mock-up  inspection. 

4.3.3  Part  Task  Testing 

The  initiation  of  human  factors  evaluation  from  the  beginning 
of  the  system  development  with  the  assignment  of  manufacturer  and 
customer  personnel  to  conduct  and/or  monitor  these  evaluations  allows 
for  the  introduction  of  other  evaluative  techniques.  These 
techniques  are  those  which  we  have  termed  "open  loop"  tests  of 
display  design  configuration  and  which  may  be  used  in  weeding  out 
or  narrowing  down  design  alternatives  in  early  stages  of  development. 
These  techniques  are  essentially  tachistoscopic  presentations  of 
display  designs  used  for  comparative  evaluation.  They  are,  in  fact, 
the  techniques  which  were  used  largely  in  the  laboratory  setting, 
in  obtaining  a  major  portion  of  the  human  factors  data  available  to 
us today.

The advantage in using such techniques as a part of the
development of a system over their use in the laboratory is that they
would be used by those familiar with the problems, variables and
constraints peculiar to the system of interest and would generate
data applicable to system needs. The technique has been tried in
a limited way in connection with system development. Its point of
entry  into  the  evaluation  process  is  shown  in  Figure  2. 

The  open-loop  testing  technique  is  termed  "open-loop"  since 
the  response  of  the  subject  has  no  direct  effect  upon  the  next 
stimulus presentation. The stimulus material may be a display
configuration presenting information which the subject is required
to interpret, read out, and report. Measures of the subject's
performance may be the speed and accuracy with which the information
is read out. The stimulus material may be used in the form of a
static projection, i.e., a 35 mm slide, in which case it is termed
static open-loop testing. The material may be presented through
use  of  motion  picture  film  in  which  display  elements  move  realistically. 
This  type  of  testing  is  then  termed  "dynamic  open-loop"  testing. 
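
The speed and accuracy measures described above lend themselves to a
simple comparative tabulation. The following is a minimal sketch; the
trial record layout (design label, response time in seconds, whether
the readout was correct) and the function name are assumptions made
for illustration and are not taken from the cited study.

```python
from collections import defaultdict
from statistics import mean


def score_open_loop_trials(trials):
    """Summarize open-loop trials by display design.

    Each trial is a tuple (design, response_seconds, correct). Because
    the test is open-loop, the subject's response does not alter the
    next stimulus, so trials can be pooled by design without regard to
    what the subject did on earlier presentations.
    """
    grouped = defaultdict(list)
    for design, seconds, correct in trials:
        grouped[design].append((seconds, correct))

    summary = {}
    for design, rows in grouped.items():
        summary[design] = {
            "trials": len(rows),
            "mean_seconds": mean(seconds for seconds, _ in rows),
            "proportion_correct": sum(1 for _, ok in rows if ok) / len(rows),
        }
    return summary
```

Competing display designs can then be compared on both measures at
once, since a design read quickly but inaccurately is no improvement.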

A detailed description of this type of testing is given in Schum,
Elam and Matheny (1962) in which its use has been demonstrated in
connection  with  the  study  of  design  of  altimeter  is  poor  compared  with 
other  designs. 

The  point  to  be  made  is  that  human  factors  evaluation  should 
begin  at  the  point  of  assignment  of  functions  in  the  development 
process  and  continue  through  to  use  in  the  fleet.  During  this 
evaluation  process  at  least  five  methods  of  evaluation  are  applicable. 
These are (1) human engineering check lists, (2) static and dynamic
open-loop tests, (3) mock-up evaluations, (4) tests in the simulator
of  the  system  and  (5)  tests  in  the  actual  system.  A  suggested 
relationship  of  these  types  of  testing  throughout  the  development 
and  employment  of  a  system  is  given  in  Figure  3. 

Human  engineering  check  lists  are  particularly  appropriate  to 
early stages of human engineering evaluation and can be used with the
mock-up for type one evaluations. A listing of check lists felt to
be representative of those in use is given in Section 8.0, Source
Material. Open-loop testing lends itself to comparative evaluations
of components and subsystems. The simulator and aircraft are most
effective in evaluating whether or not human performance meets
specific criteria for system effectiveness.


FIGURE 3


RELATIONSHIP AND EXTENT OF USE OF EVALUATION PROCEDURES
THROUGHOUT THE SYSTEM DEVELOPMENT PROCESS
(SUGGESTED FOR STUDY)


1.  Human  engineering  check-lists  used  to  evaluate  design  compatibility 
with  human  capabilities  and  limitations. 

2. Open-loop tests used for comparative evaluations of display and
configuration design.

3.  Mock-up  used  in  conjunction  with  human  engineering  check-lists  to 
determine  design  compatibility  with  human  capabilities  and  limitations. 

4. Simulator used in development phase to evaluate performance against
system, subsystem and component performance criteria and to train Naval
personnel. In BIS and Fleet Use it is used as in development plus:
o Evaluate proposed new or modified tactics
o Accident investigation
o Diagnosis of performance which is below criterion requirements

5. Actual system used to evaluate performance against system performance
criteria and to evaluate proposed new or modified tactics.

* NPE - Navy Preliminary Evaluation
** BIS - Board of Inspection & Survey


Under  this  conceptualization  the  mock-up  inspection  as  such 
takes on a different meaning. Under it the mock-up of the operator
station as a design and evaluation tool is used on a continuing
basis by manufacturer personnel under continuing monitorship of
user personnel. The significance of and necessity for mock-up
inspections are considerably reduced. As indicated earlier a
greater stability in user monitoring personnel would be required
and  a  closer  and  continual  interaction  effected  between  manufacturer 
and  user  evaluation  personnel. 
5.0 PERSONNEL AND ORGANIZATIONAL CONSIDERATIONS


5.1  DOCUMENTATION 

In the recent past the requirements for consideration of human
factors in the design and evaluation of Navy systems have become more
explicitly documented. This comes about through adoption of
MIL-H-46855, Human Engineering Requirements for Military Systems,
Equipment and Facilities, 16 February 1968 and MIL-STD-1472, Human
Engineering Design Criteria for Military Systems, Equipment and
Facilities, 9 February 1968. These two documents when cited in a
contract specification provide authority for carrying out effective
human factors effort. Documentary authority is diluted, however, when
adequate supporting personnel are not available to implement it at the
managerial and working levels. The documents contain the
authority for aggressive and definitive human factors work during
RDT&E when enough human factors personnel are available. At the
same time the documents themselves cannot spell out the requirements
in the detail which ensures accomplishment of a good human factors
effort in the absence of trained personnel in sufficient numbers
dedicated to making their contribution felt.

5.2  MANAGEMENT 

For  human  factors  evaluation,  assignment  of  the  right  personnel 
begins  at  the  project  office  with  responsibility  vested  in  a 
designated  individual  for  the  evaluation  function.  It  should  be 
his  responsibility  to  see  that  the  system  and  time  line  analyses 
contain  the  information  on  test  points  and  their  priorities 
(see  Section  3.2).  These  test  points  should  be  within  the  context 
of  an  evaluation  mission.  This  mission  set  forth  early  in 
development  will  evolve  in  detail  as  the  equipment  configuration 
becomes firmed up. It is the framework upon which both Human
Engineering design and evaluation will hang. The human factors
engineer in the Project Office must ensure that this mission and
the evaluation test points are developed.

5.3  ASSIGNMENT  OF  NAVY  PERSONNEL  TO  CONTRACTOR  FACILITIES 

The assignment of customer personnel to the human factors
effort at the contractor's facility has already been suggested.
It is suggested that, optimally, these personnel be graduates of the
Navy Test Pilot School with special additional training in human
factors and recent experience in systems similar to the one being
evaluated. The importance of the human component to the system
and the scarcity of hard data on his performance in such systems
warrant giving special attention to the qualifications of the
design and evaluation personnel who deal with him.

These Naval personnel should be selected for their interest
in and any special qualifications for human factors work. Paralleling
the practice of test pilots having the additional qualifications
of holding aeronautical or other engineering degrees, the human
factors specialty should require special qualifications in human
engineering. A special course of three to four weeks minimum should
be made available for this specialty, coming after the individual
has qualified as a Naval Test Pilot. This special training course
is discussed further in Section 5.5 and a suggested outline is
given in Appendix B. This course is suggested as a minimum requisite
and not a substitute for formal University training in Human Factors.
The responsibilities of these personnel at the contractor's facility
would be purely advisory in helping to develop the evaluation
mission and the critical test points. They would become members of
the evaluation team during mock-up inspection, lending their
detailed knowledge of the operator requirements and the data flow
through the system to a more objective and mission oriented evaluation.

5.4  CONTRACTOR  ASSIST  DURING  NAVY  EVALUATIONS 

During evaluations carried out at the Navy's facilities it is
recommended that contractor human factors personnel assist and
advise on-site in the planning and conduct of the tests. Their
detailed knowledge of the system will help the Navy evaluator
immeasurably in working out the details of how to
test at critical test points and in diagnosing sources of operator
difficulties when his performance cannot be directly measured or
clearly inferred.

5.5  TRAINING  OF  EVALUATION  PERSONNEL 

The  approach  to  be  taken  by  the  evaluator  in  any  evaluation  is 
essentially  that  which  the  serious  experimenter  would  take  in  testing 
an  hypothesis.  He  must  be  as  knowledgeable  and  have  as  much 
quantitative  information  as  possible  about  the  variables  and  conditions 
influencing  the  operator  behavior.  He  must  either  control  these  or 
be able to assess their effects. He must also have an understanding
of experimental design, of reliability of measurement and of data
analysis and reporting.

An appreciation of these requirements coupled with experience
and knowledge of the operational conditions under which the system
will function would combine to maximize the effectiveness of the
human factors evaluator. Their combination in a single individual
is rare. A training program designed to produce such a combination
would be quite beneficial.

The  human  factors  engineer  usually  comes  to  the  evaluation 
situation  with  a  limited  knowledge  and  appreciation  of  the  operational 
demands. The Navy project personnel assigned to evaluate the system
have the operational experience but usually are not experienced in
the methods of experimentation which should be applied. A cross
training program is recommended.

It is suggested that Navy project personnel could become
oriented and sufficiently knowledgeable about experimental methods
through an indoctrination course of three to four weeks minimum.
This course would be offered to project officers and project pilots
who are directly concerned with planning and conducting the
evaluation. A suggested outline for this course is contained in
Appendix B.

Human  factors  personnel  should  be  indoctrinated  in  the 
operational  use  of  like  systems  in  every  way  possible.  Cdr.  Wherry 
has  suggested  that  short  tours  aboard  carriers  by  human  factors 
personnel  should  be  undertaken.  Every  opportunity  for  these 
personnel  to  observe  the  operation  of  similar  systems  either  in 
the  operational  theater  or  more  routine  operation  should  be  taken. 
6.0 CONCLUSIONS AND RECOMMENDATIONS


After  reviewing  relevant  Navy  documents  and  pertinent  reports 
in  the  general  literature,  holding  interviews  and  discussions 
with  those  experienced  with  and  involved  in  design  and  evaluation, 
studying  the  design  and  evaluation  process  of  one  system  in  detail 
with  a  less  detailed  look  at  two  other  aircraft  systems,  and  drawing 
upon  first  hand  experience  with  the  design  and  evaluation  problem 
of a major aircraft system, the conclusions and recommendations
enumerated below have been drawn.

These recommendations stem from the philosophy that changes
in the human factors evaluation procedure must be evolutionary
rather than revolutionary. They must be evolutionary primarily
because  the  required  number  of  qualified  people  are  not  available 
to  institute  mandated  revolutionary  changes.  Rather,  people  with 
the  necessary  orientation  and  qualifications  must  be  integrated 
into  the  evaluation  process  and  test  organizations  to  evolve 
procedures  and  to  develop  facilities  and  personnel.  The  recommendations 
given  here  are  intended  to  provide  procedures,  information  and 
tools  to  support  those  individuals  as  they  work  directly  with  the 
evaluation  of  a  system.  They  are  not  intended  as  "band-aids" 
designed  as  a  temporary  fix  on  the  human  factors  evaluation  process. 
Rather they point to fundamental needs. Some are innovative
while some require doing more formally and systematically what
is now done informally and even sporadically.


6.1  EVALUATION  MISSION 

A  system  mission  should  be  described  and  adopted  as  the  evaluation 
mission.  This  mission  should  be  built  around  the  primary  weapons 
delivery  mode  as  the  system  will  operate  in  its  most  likely  theater 
of  operation  and  incorporate  "worst  case"  conditions.  (Reference 
Section  3.4) 


6.2  TEST  POINTS 

The mission-centered system and time line analyses should
produce test points at which human component performance may be
measured directly or inferred reliably from equipment outputs.
These test points must be ranked in terms of relative importance
to mission success and the extent to which assumptions were made as to
operator capabilities. (Reference Sections 3.2 and 3.5)
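
The ranking called for above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the report does not say how the two ranking factors are to be combined, so the 1-to-5 ratings, the `EvaluationPoint` record, the example test point names and the rule of ordering first by mission importance and then by extent of operator assumptions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EvaluationPoint:
    """One candidate test point (hypothetical record, not from the report)."""
    name: str
    mission_importance: int  # assumed scale: 1 (low) to 5 (high)
    assumption_extent: int   # assumed scale: 1 (few assumptions) to 5 (many)


def rank_test_points(points):
    """Order test points from highest to lowest priority.

    Assumed rule: mission importance dominates; ties are broken in favor
    of points where more was assumed about operator capabilities.
    """
    return sorted(
        points,
        key=lambda p: (p.mission_importance, p.assumption_extent),
        reverse=True,
    )


example_points = [
    EvaluationPoint("radar lock-on", mission_importance=5, assumption_extent=2),
    EvaluationPoint("fuel transfer", mission_importance=3, assumption_extent=4),
    EvaluationPoint("weapon release", mission_importance=5, assumption_extent=4),
]
ranked_names = [p.name for p in rank_test_points(example_points)]
```

A weighted sum of the two ratings would be an equally defensible rule; the choice belongs to the evaluator and the project office.
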
6.3 HUMAN FACTORS AND EQUIPMENT ENGINEER COOPERATION


The measures to be taken at test points will often require
the objective recording of certain measurement parameters. Often
these parameters will be those being recorded by the design engineer
in his evaluation of the system equipment. Through close coordination
the design and the human factors engineers can use the same test
equipment to obtain data peculiar to their own needs. The
cooperative planning of human factors and equipment tests is
recommended as essential to effective use of test equipment
and test time. (Reference Section 3.7)

6.4  MEASURES  OF  ABSOLUTE  LEVEL  OF  PERFORMANCE 

Although the test measurements taken during a field evaluation
are intended to determine whether performance meets required
standards, it is strongly recommended that measures of absolute
level of performance also be obtained whenever practicable. It is
also recommended that complete descriptions of the parameters of the
equipment from which the operator receives his inputs and into
which he makes his outputs be given. The environmental and test
conditions under which the measures were taken and the level of
training and experience of the operator must also be completely
described. These data are necessary for building a pool of data
for use during design of future systems. (Reference Sections 3.3 and
4.3.1)


6.5  TROUBLE  SHOOTING 

Formal reporting procedures and forms are recommended for
operator-evaluator use in identifying operator errors or failure
to perform to the required time schedule during the test of
subsystems in which the operator performance cannot be measured
directly or inferred reliably. (Reference Section 4.2.2)

6.6  ADOPTION  OF  ALTERNATIVES  BY  THE  OPERATOR 

Formal reporting procedures and forms are recommended for
operator-evaluator use in reporting instances in which it was
necessary for him to adopt a procedure or technique different
from the design procedure or technique in order to successfully
accomplish his task or mission.

6.7  MISSION  ORIENTED  COCKPIT  EVALUATION 

In working with the cockpit mock-up evaluation, procedures
and check-lists are recommended which are system-mission oriented
in addition to those which are cockpit evaluation oriented.
(Reference Section 4.3.2)
6.8 DYNAMIC PHYSICAL MODELS (SIMULATORS)


A dynamic physical model of the system (simulator), if made
available during early design, would allow for evaluation with a
much more representative model of the final aircraft than is
possible with the static mock-up. It is recommended that the
advantages and disadvantages of a major simulator installation at
a Naval Test Facility be investigated in detail. These
investigations should set forth the pros and cons of a facility
designed for a single and dual seat aircraft, degree of flexibility
required and its applicability to design evaluation and refinement
throughout the development process. (Reference Section 4.3.2)

6.9 ASSIGNMENT OF NAVAL PERSONNEL TO CONTRACTOR'S FACILITY

It is recommended that the mock-up inspection procedure be
strengthened by the assignment to the contractor's facility of
customer personnel especially trained in human factors evaluation
and experienced in prior systems. These personnel would work with
human factors personnel in defining the evaluation mission, test
points and evaluative procedures. These personnel, although
working in an advisory capacity during design, would become
members of the mock-up inspection team during that evaluation.
(Reference Section 5.3)

6.10  ASSIGNMENT  OF  CONTRACTOR  PERSONNEL  TO  FIELD  EVALUATIONS 

It is recommended that contractor human factors personnel
work with Naval evaluation personnel during field evaluation of
the system. Contractor personnel would provide system information
and assist in evaluation planning. (Reference Section 5.4)

6.11  SPECIALIZED  TRAINING  PROGRAMS 

It is strongly recommended that special training in human
factors design and evaluation methods be provided to Navy personnel
responsible for human factors test and evaluation. A three to
four week course after Navy Test Pilot qualification is recommended
as a minimum requisite but is not intended to substitute for formal
University training in Human Factors.

It is also recommended that evaluation personnel whose
specialty and background of training is in Human Factors but who
do not have pilot and/or operational experience be given
indoctrination tours with the fleet to observe the operation of
similar systems and to discuss operational tactics and problems
with fleet personnel. (Reference Section 5.5)
7.0 APPLICATION OF CONCLUSIONS


This study of the human factors field evaluation problem has
resulted in conclusions and recommendations deduced from the
general literature, Navy documents, examination of case histories,
interviews and the investigator's own experience with the problem.
These conclusions and recommendations, to be useful to the Navy,
need demonstration of their workability within the field test and
evaluation context. To this end the next step in this investigation
will be the trial of certain of the recommendations during the
field test of a specific system.

The system within which recommendations will be tested is the
P-3C aircraft. The system analyses will be examined to determine
test points and their priorities. Performance measures and methods
of data recording will be specified and data recording will be
undertaken in the field. From these activities the details of the
field test procedure will be evolved and conclusions and
recommendations will be modified as necessary.
APPENDIX A


Reference Table of Measurement Parameters Applicable to
Operator Performance Evaluation in Aerospace Craft.

(For  a  detailed  discussion  of  measurement  parameters  see 
text,  Section  3.7,  Pages  14  -  19). 


ABOUT AXIS CONTROL       ALONG AXIS CONTROL            "SYSTEM STATE" CONTROL
(Attitude)               (Navigation)

Pitch angle              Z axis position (alt.)        Proper control position and
Roll angle               X axis position               display reading and execution
Yaw or heading angle     Y axis position               of proper procedures for
Pitch rate (q)           Vertical rate (Ż)             subsystems:
Roll rate (p)            Longitudinal rate (Ẋ)           Weapons
Yaw rate (r)             Lateral rate (Ẏ)                Communications
Control position         Vertical acceleration (Z̈)       Hydraulic
Control rate             Lateral acceleration (Ÿ)        Environmental
                                                         Electrical
                                                         Flight Controls
                                                         Engine and related systems
                                                         Fuel
                                                         Radar
APPENDIX B


This  appendix  contains  the  suggested  outline  of  a  course 
designed  for  field  test  personnel  whose  particular  interest  and 
concern  is  human  factors  testing.  The  course  is  intended  to  cover 
items  essential  to  an  overall  orientation  in  human  factors  test 
and  evaluation  and  to  provide  specific  guidance  in  methods  and 
techniques  necessary  to  effective  evaluation. 

In  this  course  outline  a  differentiation  is  made  between 
field  research  and  field  evaluation  with  similarities  and 
differences  drawn  between  them.  While  the  primary  aim  of  the 
course  is  to  provide  guidance  for  field  evaluation,  points 
relevant  to  field  research  are  given  in  order  that  the  human  factors 
evaluator may be sensitive to and have some guidance in field
research  techniques  should  the  opportunity  arise  to  apply  them 
during  a  field  evaluation.  His  motive  for  doing  so  would  be  the 
collection  of  data  under  prescribed  conditions  for  entry  into  the 
general  human  factors  data  bank. 


COURSE  OUTLINE 

1.  Distinction  between  field  research  and  field  evaluation. 

A.  Field  Research 

1)  More  opportunity  to  identify  and  control  the  parameters 
and  variables  of  interest. 

2)  Usually  collecting  normative  data. 

3) May be testing to determine how well system, subsystem or
component meets some set or required level of performance.

4)  More  opportunity  to  introduce  performance  measures  and  special 
instrumentation  to  obtain  those  measures. 

5)  More  flexibility  in  changing  procedures  and  equipment  as 
testing  progresses  to  achieve  the  desired  goals  of  testing. 

B.  Field  Evaluation 

1)  Nearly  always  testing  to  determine  whether  system  performance 
comes  up  to  some  specified  level  of  desired  performance. 

2) Nearly always there is a limitation on time within which the
evaluation is to be performed.

3) Must fit measures of man's performance into the tests of
equipment.

4) Need diagnostic measures to isolate those points at which
performance has failed to meet criteria when total system
performs below criterion level.

5)  Little  or  no  opportunity  to  vary  independent  variables 
systematically. 

2.  General  setting  within  which  field  research  and  evaluation  are  conducted 

A.  Some  fixed  time  span. 

B.  Testing  both  equipment  and  men. 

1) Some tests peculiar to equipment alone, e.g., how it functions
under the field conditions.

2) Some tests peculiar to man alone, e.g., his physiological
state under the field conditions.

3)  Some  tests  peculiar  to  the  interaction  between  men  and 
equipment,  e.g.,  how  well  man  can  operate  or  maintain 
equipment  under  the  field  conditions. 

C.  At  present  human  factors  testing  often  goes  "piggy  back"  on 
equipment  testing  —  or  must  be  fitted  in  and  around  equipment 
testing.  This  is  a  fact  of  human  factors  testing  which  must 
be  recognized  in  setting  up  human  factors  research  and 
evaluations  in  the  field.  This  situation  is  not  to  be  construed 
as all bad. A great deal of human factors data collecting can
be carried out in conjunction and simultaneously with equipment
testing. It is the efficient and many times the most appropriate
thing to do. However, it will be necessary at times to program
into the testing schedule specific blocks of time for collecting
human  factors  data  independent  of  tests  on  any  given  piece  of 
system  equipment. 

3.  The  requirement  for  thorough  knowledge  of  the  system  under  test, 
its  operation  and  its  operating  environment. 

o To isolate and define the important parameters
which may influence performance.

o To set up methods of either controlling or
systematically varying these parameters.

o To determine what measures are appropriate and at
what test points they will be taken.

4. Methods for determining necessary system details.

A.  Data  flow  analysis. 

1) What inputs are (or must be) received by each component or
subsystem and what outputs are (or must be) made to the
next component(s) or subsystem(s).

2) For man as an information processor we are interested in the
following general types of information: a) what information
must he receive or is the system designed for him to
receive? b) what transformations of the information must
he make? c) what "informational" outputs must he make to
other components of the system, either man or machine?

3) Nomenclature or symbology used is not necessarily important;
however those which may be used are.

B.  Time  Line  Analysis  (serves  both  as  a  means  of  learning  the 
system  and  evaluating  it  -  also  serves  to  evaluate  the  adequacy 
and  practicality  of  the  test  and  evaluation  proposed  to  be 
carried  out.) 

1)  Segment  overall  program  or  mission  into  phases  which  can  be 
bounded  in  time  by  externally  imposed  limits. 

2)  Detail  actions  and  procedures  to  be  executed  within  each 
segment. 

3) Assign "time to perform" to each action or procedure,
obtaining times either from available data, from estimates or
from data taken in mock-up, simulator or actual system.

4) Compare actual times to complete segments with the times allowed
by the program or mission requirements established in 1).

C. A list of techniques for system operational description and an
exposition of the Operational Sequence Diagram are given in "Human
Factors Design Standards for the Fleet Ballistic Missile Weapon
System," NAVWEPS OD 18413A, Vol. 1, Pages 50-60.
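
The four time line analysis steps in 4.B lend themselves to a small worked example. The Python sketch below is a minimal illustration, not a procedure taken from the report: the segment names, action names and times are invented, and the actions within a segment are assumed to be performed strictly one after another, so that the required time is a simple sum.

```python
def evaluate_timeline(segments):
    """Compare required and allowed time for each mission segment.

    `segments` maps a segment name to (allowed_seconds, actions), where
    `actions` is a list of (action_name, time_to_perform_seconds).
    Assumption: actions in a segment are serial, so times simply add.
    """
    results = {}
    for name, (allowed_s, actions) in segments.items():
        required_s = sum(time_s for _, time_s in actions)
        results[name] = {
            "required_s": required_s,
            "allowed_s": allowed_s,
            "margin_s": allowed_s - required_s,
            "within_limit": required_s <= allowed_s,
        }
    return results


# Hypothetical mission segments with externally imposed time limits.
example_segments = {
    "target acquisition": (
        30.0,
        [("detect", 8.0), ("identify", 12.5), ("designate", 6.0)],
    ),
    "weapon release": (
        10.0,
        [("arm", 4.0), ("confirm", 5.0), ("release", 2.5)],
    ),
}
example_results = evaluate_timeline(example_segments)
```

With these invented figures the first segment leaves a 3.5 second margin, while the second overruns its limit by 1.5 seconds and would be flagged for closer evaluation. Where actions can overlap, the simple sum would overstate the required time.
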

5. Some general principles of sound research and evaluation procedure for
which to aim.

A. Standardization

1) Test conditions.

2) Performance measures.

3) Observers.

4) Environmental effects.

B. Control or assessment of relevant variables.

1) If variable cannot be controlled it should be measured
at the time dependent variables are measured.

C. Explicit statement of independent and dependent variables with
no variation in their meaning or method of measurement.

D.  Dependent  variables. 

1)  These  are,  in  effect,  precise  statements  of  what  we  want  to 
know  about  how  the  system  or  the  man  within  the  system 
performs. 

2)  In  field  research  we  may  term  these  questions  the  dependent 
variables.  In  field  evaluation  we  may  term  them  the 
criteria  against  which  we  measure  system  or  man's  performance. 

3)  Must  order  dependent  variables  in  order  of  priority  of 
importance  since  and/or  time  is  usually  limited. 

4) Must decide upon points in system at which measurements will
be taken.

5)  Must  determine  the  method  of  measurement,  i.e.,  direct 
recording,  direct  observation,  rating  scales,  questionnaire, 
etc. 

6)  Always  detail  in  advance  the  method  of  data  reduction, 
analysis  and  presentation.  Shotgun  approach  to  data 
collection  is  not  feasible  in  field  situations. 

7)  General  classes  of  dependent  variables. 

a) Man's outputs.

a. Time to perform.

b. Accuracy of performance.

b) Man's inputs.

a. Standard deviation of control displacements (or
pressures) about the mean control position.

b. Power spectrum or autocorrelation function.

c)  Man's  physiological  state. 

d)  Man's  psychological  state. 

E.  Independent  Variables. 

1) In field research may have the opportunity to assess, control
or vary systematically the independent variables. In field
evaluation usually have opportunity only to assess these
variables.

2) From system and task analysis identify and describe both
system and environmental parameters likely to affect
performance.

3)  Field  research  will  normally  be  set  up  for  the  test  of  the 
effects  of  specified  parameters  upon  performance. 

4)  General  normative  data  vs.  data  specific  to  a  system  influence 
selection  of  independent  variables. 
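
Two of the control-input measures named under 5.D.7 above, the standard deviation of control displacement about the mean position and the autocorrelation function, can be computed directly from a sampled record. The Python sketch below is offered only as an illustration. It assumes equally spaced samples, uses the population form of the standard deviation (dividing by n), and normalizes the autocorrelation by the lag-zero sum of squares; none of these choices is specified in the outline, and the sample record is invented.

```python
import math


def control_std(samples):
    """Standard deviation of control displacement about its mean.

    Assumption: population form (divide by n), equally spaced samples.
    """
    n = len(samples)
    mean = sum(samples) / n
    return math.sqrt(sum((x - mean) ** 2 for x in samples) / n)


def autocorrelation(samples, lag):
    """Normalized autocorrelation of the record at an integer lag.

    Returns a value between -1 and 1; lag 0 always gives 1.
    """
    n = len(samples)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < number of samples")
    mean = sum(samples) / n
    total = sum((x - mean) ** 2 for x in samples)
    if total == 0:
        raise ValueError("record has no variation")
    cross = sum(
        (samples[i] - mean) * (samples[i + lag] - mean) for i in range(n - lag)
    )
    return cross / total


# Invented record: a control stick alternating about its mean position.
stick_record = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
stick_std = control_std(stick_record)
stick_acf = [autocorrelation(stick_record, lag) for lag in range(3)]
```

A strongly negative value at lag 1, as in this record, indicates rapid reversals of the control; a power spectrum would present the same information in the frequency domain.
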


6.  Criteria 

o  Implies  some  value  judgment  as  to  the  "goodness"  of 
the  performance. 

o  Measurement  per  se  does  not  provide  value  judgments. 

o  These  value  judgments  must  be  expressed  in  terms  of  the 
desired  purposes  or  the  mission  of  the  system. 

A.  Ultimate  vs.  Actual  Criteria 

1)  Seldom  possible  to  obtain  direct  measures  of  the  ultimate 
criteria. 

2) Usually necessary to select some actual (intermediate) criteria.

3)  Must  then  use  these  actual  (intermediate)  criteria  in 
evaluating  performance. 

4)  There  is  no  certain  method  for  specifying  the  actual 
criteria. 

5)  Sources  of  error  in  selecting  actual  criteria. 

a.  Unreliability. 

b.  Irrelevancy  -  the  lack  of  relation  to  ultimate  criterion. 

c. Contamination - ingredients in the actual criteria
which do not, in fact, comprise the ultimate criterion.

d.  Distortion  -  errors  arising  from  assigning  incorrect 
weights  to  the  separate  factors  that  comprise  the 
actual  criteria. 

B.  Establishing  Valid  Criteria. 

1)  No  established  procedures. 

2)  Recognizing  importance  of  criteria  selection  and  types  of 
errors  which  might  be  present  are  good  starting  points. 
3) Steps which should lead to more useful and relevant
criteria.

a. Define the activity - specify to extent possible the
activity desired for successful and proficient
performance.

b. Analyze the activity - consider the activity in terms of
purposes or goals, behavior and skills involved, their
relative importance and standards of performance expected.

c.  Define  proficient  and  successful  performance. 

d.  Develop  sub-criteria  to  measure  each  element  of  success. 

e.  As  appropriate  develop  a  combined  measure  of  successful 
performance. 

C.  Combining  Actual  Criteria. 

1)  Often  necessary  that  several  criteria,  all  of  which  are 
relevant  for  a  particular  activity  be  used.  In  such  case  it 
may  be  desirable  to  combine  them  into  a  single  comprehensive 
one. 

2) Combining will usually involve assigning relative weights
to the individual criteria.

3) Rules for combining criteria.

a. Weight in accordance with their relevance to the ultimate
criterion.

b. Criteria which repeat or overlap factors in other
criteria should receive low weight.

c. Other things being equal the more reliable criteria
should receive more weight.

4) Caution must be exercised in applying weights to raw score
values - use standard scores.
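
The caution in 6.C.4 can be made concrete. If weights are applied to raw scores, a criterion measured in large units dominates the composite regardless of its weight; converting each criterion to standard scores first removes that effect. The Python sketch below is a minimal illustration under stated assumptions: the criterion names, scores and weights are invented, every criterion is scored so that higher is better, and the population standard deviation is used. The weights themselves must still come from the evaluator's judgment under the rules in 6.C.3.

```python
from statistics import mean, pstdev


def standard_scores(raw):
    """Convert raw scores to standard (z) scores."""
    m = mean(raw)
    s = pstdev(raw)  # assumption: population standard deviation
    if s == 0:
        raise ValueError("criterion has no variation; cannot standardize")
    return [(x - m) / s for x in raw]


def combine_criteria(raw_by_criterion, weights):
    """Weighted composite of standardized criteria, one value per operator.

    `raw_by_criterion` maps a criterion name to a list of raw scores
    (same operators, same order); `weights` maps the same names to
    relative weights chosen by the evaluator.
    """
    z_by_criterion = {
        name: standard_scores(values)
        for name, values in raw_by_criterion.items()
    }
    lengths = {len(z) for z in z_by_criterion.values()}
    if len(lengths) != 1:
        raise ValueError("every criterion needs one score per operator")
    n = lengths.pop()
    total_weight = sum(weights[name] for name in z_by_criterion)
    return [
        sum(weights[name] * z[i] for name, z in z_by_criterion.items())
        / total_weight
        for i in range(n)
    ]


# Invented scores for two operators on two criteria (higher is better).
example_composite = combine_criteria(
    {"accuracy": [70, 90], "speed": [3, 1]},
    {"accuracy": 3, "speed": 1},
)
```

For these figures the composite is -0.5 for the first operator and 0.5 for the second, reflecting the 3-to-1 weighting rather than the size of the raw units.
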

7. Measurement of Performance.

A.  What  to  measure. 

1)  Should  be  preceded  by  an  explicit  statement  of  the  research 
questions  being  asked. 

2) Even though one is carrying out field research only to
obtain normative data, the experimental questions are
couched within the framework of some background knowledge
of systems and tasks about which data is needed to
determine:

a) Whether man can perform certain tasks and to what
level of performance.

b) How well the system would function if man were placed
as a component within the system.

3) Need analyses of systems which allow statement of the critical
tasks within those systems or classes of systems about which
data is needed.

4)  Identification  of  critical  tasks. 

a) What is measured depends on purpose, so procedures and
emphasis will be different for research, for training
and for proficiency evaluation.

b) Not necessary or feasible to measure everything - rather
measure a sample of behavior. Shotgun approach to
measurement not desirable.

c) As a rule, select for measurement those tasks on which
good performance leads to mission success and poor
performance leads to mission failure.

d) In identifying and defining tasks and mission segments
critical to system performance start with a descriptive
analysis on a time line basis of the tasks that make
up system operation.

e) In identifying critical tasks asking the following questions
with respect to the tasks is helpful.

Would below-minimum performance:

o lead to an accident?

o result directly in mission failure?

o be impossible to remedy within the time
constraints or not at all?

o be difficult to detect because of inadequate
information feedback?

o recur over time in such a way as to produce a
cumulative effect?

o contribute a large proportion of time to the
total time required for some larger and critical
function?

f) Emphasis to be placed on individual activities and
crew-conducted activities.

1) Should be determined from an analysis of the relative
importance of the two within the system.

2)  Will  depend  on  the  stage  of  learning  -  ordinarily, 
individual  activities  are  stressed  in  earlier 
learning  stages  whereas  crew  functions  are  emphasized 
in  terminal  phases  of  training. 

B.  Levels  of  Measurement 

1)  Nominal  Scale. 

a.  Provides  identity  only. 

b.  Assigns  unit  or  event  to  class  or  set. 

2)  Ordinal  Scale 

a.  Provides  both  identity  and  order. 

b.  The  units  are  assigned  a  rank  order. 

c. Does not connote quantitative measurement as such but
rather  a  judgment  of  the  amount  possessed  by  the  units 
involved. 

3)  Interval  Scale. 

a.  Provides  identity,  order  and  additivity. 

b. Units are scaled in equidistant terms.

c.  Intervals  between  quantities  are  equal. 

d. Zero point is arbitrary and does not designate complete
absence  of  the  property. 

4)  Ratio  scale. 

a. Provides identity, order and additivity with reference
to  an  absolute  zero. 

b.  An  extension  of  the  interval  scale  with  a  natural 
absolute  zero  point. 
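
The difference between an arbitrary and an absolute zero point can be illustrated with a short Python sketch. Temperature is used only as a familiar stand-in and is not an example taken from this report; the function and variable names are hypothetical. Celsius and Fahrenheit are treated as interval scales, Kelvin as a ratio scale.

```python
def celsius_to_fahrenheit(c):
    """Convert between two interval scales (both have arbitrary zeros)."""
    return c * 9.0 / 5.0 + 32.0


def celsius_to_kelvin(c):
    """Convert to a ratio scale (Kelvin has an absolute zero)."""
    return c + 273.15


low_c, mid_c, high_c = 10.0, 20.0, 30.0

# Equal intervals remain equal after a change of interval scale.
step_1_f = celsius_to_fahrenheit(mid_c) - celsius_to_fahrenheit(low_c)
step_2_f = celsius_to_fahrenheit(high_c) - celsius_to_fahrenheit(mid_c)

# Ratios of interval-scale values depend on where the arbitrary zero sits,
# so "twice as much" is not a meaningful statement on such a scale.
ratio_c = mid_c / low_c
ratio_f = celsius_to_fahrenheit(mid_c) / celsius_to_fahrenheit(low_c)

# On the ratio scale the quotient refers to an absolute zero.
ratio_k = celsius_to_kelvin(mid_c) / celsius_to_kelvin(low_c)

print("Fahrenheit steps:", step_1_f, step_2_f)
print("Ratios (C, F, K):", ratio_c, ratio_f, ratio_k)
```

The two Fahrenheit steps agree, while the three ratios differ; only the Kelvin ratio can be read as a statement about relative amount.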

C.  Purpose  of  Measurement. 

1) Prediction of future success.

2)  Evaluation  of  present  performance. 

3) Evaluation of learning rate.

4) Identification of areas of strength and deficiency.

5)  Evaluation  of  training  effectiveness. 

6)  Selection  and  placement  of  individuals  and  teams. 

7)  Refinement  of  criterion  information. 

8)  Definition  of  requirements. 

D.  Overall  vs.  Diagnostic  Measures 

1) Overall Measures.

a)  Global  indices  of  sub-system  or  system  performance 
associated  with  mission  segments  or  complete  mission. 

b) Useful in assessment since they are descriptive of some
end result which can be compared with the standard.

c)  Weak  in  analytic  sense  since  they  provide  no  detailed 
information  on  performance  beyond  the  outputs  sampled. 

2)  Diagnostic  Measures. 

a)  Quite  specific,  identifying  elements  of  job  performance 
in  specific  skill  areas. 

b)  Since  they  are  concerned  with  smaller  more  precisely 
defined  units  of  behavior  they  lend  themselves  more 
readily  to  objective  valid  measurement. 

c)  Those  relevant  to  criterion  performance  (predictive) 
are  more  useful  in  differentiating  among  operators 
or  crews  since  terminal  measures  often  contain 
uncontrolled  variables. 

E.  Accuracy  of  Measurement. 

1) Refers to how close the obtained value or measure is to the
true value.

2) There is no single way to assure measurement accuracy -
accuracy may be improved by the following means.

a)  Increase  scope  of  measurements  to  be  taken  -  include 
additional  aspects  of  relevant  behavior. 

b) Increase the number of observations on which means,
etc. are based.

c) Control the conditions under which measurements are taken.

1)  Define  those  factors  present  and  held  constant 
and  those  factors  to  be  varied. 

2)  Maintain  conformity  of  conditions  throughout 
period  of  measurement. 

3)  Insure  that  the  measures  are  taken  correctly. 

F. Reliability of Measurement.

1) Definition - agreement or consistency of measures from
repeated observations.

2)  Must  have  high  reliability  to  obtain  validity  -  may  have  high 
reliability  but  no  validity. 

3)  Absolute  expression  of  reliability. 

a)  Standard  error  of  measurement  -  specifies  the  limits 
within which an obtained value may be expected to vary.

$SE = s\sqrt{1 - r}$

where: s = the standard deviation of the measures

r = the reliability of the measures
expressed in terms of a correlation
coefficient.

It  is  interpreted  in  a  manner  similar  to  the  SD. 

4)  Relative  measures  of  reliability. 

a)  Expressed  in  terms  of  correlation  coefficient. 

b)  Coefficient  of  internal  consistency,  i.e.,  split  half  method. 

c)  Coefficient  of  stability  -  extent  to  which  measurements 
agree over a period of time. Correlation between measures
taken  at  identical  observation  points  with  an  intervening 
time  lapse. 

d)  Coefficient  of  equivalence  -  correlation  between  two 
different  measures  which  are  known  or  presumed  to  be 
equivalent.

e) Percent agreement of values taken during repeated
observations.
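
The reliability expressions above can be sketched in Python as follows. This is a minimal illustration, not a procedure prescribed by the report: the scores are invented, the function names are hypothetical, the reliability r is taken to be a product-moment correlation between two repeated sets of measures (a coefficient of stability), and the sample standard deviation is assumed for s.

```python
import math
import statistics


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standard_error_of_measurement(scores, r):
    """SE = s * sqrt(1 - r); s is the sample SD here (an assumption)."""
    return statistics.stdev(scores) * math.sqrt(1.0 - r)


def percent_agreement(first, second):
    """Percent of repeated observations that yield identical values."""
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return 100.0 * matches / len(first)


# Hypothetical scores for six operators measured on two occasions.
trial_1 = [12.0, 15.0, 11.0, 18.0, 14.0, 16.0]
trial_2 = [13.0, 14.0, 11.0, 19.0, 15.0, 15.0]

r = pearson_r(trial_1, trial_2)
se = standard_error_of_measurement(trial_1, r)
print("reliability r:", round(r, 3))
print("standard error of measurement:", round(se, 3))
print("percent agreement:", percent_agreement(trial_1, trial_2))
```

Note that SE shrinks toward zero as r approaches 1 and equals s when r is 0, which is why it is read in a manner similar to the SD.
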

G. Validity of Measurement.

1) Definition - degree to which measuring instruments measure
what they are intended to measure.

2)  Four  types  of  validity. 

a)  Content  validity  -  logical  validity  based  on  expert 
opinion  or  other  logical  considerations. 

b)  Concurrent  validity  -  statistical  validity  -  correlation 
with  other  task  or  dimension  external  to  the  measurement. 

c) Predictive validity - statistical correlation between
obtained  measures  and  future  states  on  some  task  or 
dimension  external  to  the  measurement. 

d)  Construct  validity  -  logical  validity  -  where  the  emphasis 
is  on  the  trait,  quality  or  ability  presumed  to  underlie 
the  measures  being  taken. 

H. The Question of Quantitative (Objective) vs. Qualitative
(Subjective) Measures.

1)  Objective  Measures. 

a)  Generally  permit  measurement  relatively  independent  of 
the  observer. 

b)  Generally  of  higher  reliability  than  subjective. 

c)  Greatest  objectivity  obtained  by  means  of  recording 
instruments  etc.  where  a  permanent  record  of  behavior 
is  obtained  at  the  time  of  occurrence. 

d) Insistence upon complete objectivity tends to result
in omission of a variety of critical job components
because of inability to measure them objectively.

e) Can result in impractical gadgetry and procedures.

f)  Relatively  free  from  observer  bias. 

2)  Subjective  Measures. 

a)  Generally  dependent  upon  the  characteristics  of  the 
observers - may introduce bias.

b) Inter-observer reliability not always high.

c) More flexibility in administration.

3) Ratings (a form of subjective measurement)

a)  Because  of  reliability  considerations  ratings  should 
be  reserved  for  use  in  those  instances  where  other 
measures  are  not  feasible. 

b)  Rating  procedures. 

1)  Rating  scales  -  rater  makes  judgment  on  scale  of  defined 
categories. 

a) Cooper rating scale.

b)  Cornell  Aeronautical  Lab.  use  of  Cooper  scale. 

c) LSI rating scales used in UDOFTT simulator studies.

d) LSI rating scales used in attitude studies.

2)  Comparative  systems  -  pair  people  or  units  with  respect 
to  each  other. 

a)  Comparison  between  pairs. 

b)  Card  sorting  technique. 

3) Check lists - judgments by raters as to which of a series
of descriptive terms either are or are not applicable
to the units being evaluated.

4) Critical incidents - recording actual incidents or
behaviors which are especially effective or ineffective
in the accomplishment of the mission. Standing of
unit is indicated by frequency of occurrence of
reported incidents.

c)  The  sources  of  bias  in  ratings. 

1)  Halo  effect. 

2)  Leniency  error. 

3) Error of central tendency.

4)  Contrast  error  -  tendency  to  rate  in  opposite 
direction  on  a  dimension  from  how  the  raters  see 
themselves. 

5)  Proximity  error  -  tendency  for  ratings  to  be  more 
related  when  made  close  to  each  other  in  time. 


I.  Individual  vs.  Crew  Performance. 

1)  Question  of  what  is  "crew  coordination"  remains  unanswered. 

2)  Groups  have  been  studied  from  several  conceptual  viewpoints. 

a)  Emphasis  on  group  structure  to  understand  implications 
of  formal  structure  on  group  effectiveness. 

b) Group dynamics approach in context of a social system
with  emphasis  on  the  role,  status  and  other  factors 
which  differentiate  individuals  within  the  group  and 
influence  group  effectiveness. 

c) Group as a single man-machine system, the effectiveness
of its decisions and actions being shown by response
adequacy, sequences of performance, and timeliness of
behavior. These are determined by how crew members
interact with each other and with equipment and the
manner in which communications among crew members are
achieved with regard to system output.

3) Group as a man-machine system is the approach to be taken
in Human Factors field evaluation of Navy systems being
considered here.

4) Crew performance must be regarded as more than the sum total
of the individual performances.

5)  Measures  of  crew  performance. 

a)  Synchronization  of  action. 

b)  Response  improvisation. 

c) Amount of time spent interacting - good crew should reduce
individual interaction to a minimum so that more effort
is devoted to the job and less to coordinating.

d) Amount of communication - the less the communication the
higher the degree of coordination.

e) Freedom for interpersonal communications.

f) Monitoring and/or making some responses for another crew
member.

g) Aiding in the detection of out-of-tolerance conditions.

h) Sharing of risk activities among crew members.

J. Points at which to measure.

1) Points of interaction between man and machine at which
inadequate  performance  significantly  affects  the 
accomplishment  of  the  mission. 

8.  Procedural  Steps  in  Assessment  of  Performance. 

A.  Conduct  thorough  analyses  of  the  task. 

B.  Identify  important  and  critical  aspects  of  the  task  and 
the  environment. 

C.  Define  performance  requirements  of  the  task  as  appropriate. 

D.  Select  test  points  and  measures  appropriate  to  the  behavior 
to  be  evaluated. 

E.  Determine  conditions  under  which  measures  will  be  taken. 

F.  Determine  techniques  for  obtaining  measurement  data  and  for 
combining  measures  as  appropriate. 

G.  Specify  methods  of  data  analysis. 

9.  Subjects  or  Operators  upon  which  Data  is  Collected  (Subject  Sampling) 

A.  Sample  size. 

B. Sample composition with respect to experience and other factors
and  its  relation  to  purposes  of  measurement  and  usefulness  of 
the  data. 

10.  The  Subject  Task  and  Environmental  Conditions  (Object  Sampling) 

A.  Their  relation  to  purposes  of  measurement. 

B.  Delineation  of  critical  and  relevant  task  and  environmental 
parameters. 

C. Specification and quantification for manipulation as independent
variables,  for  experimental  control,  or  for  assessment  of  the 
effect  of  their  variation. 

11.  Data  Collection  and  Treatment. 

A. Experimental methods.

1) Single vs. multi-variate designs.

a) Single variable experimentation.

b) Multi-variate design.

(1) Each subject his own control.

(a) Applicability for certain tasks but
limitations of learning, practice and
other experience in antecedent conditions
of the experiment.

(b) Advantage is requirement for smaller total
number of subjects.

(2) Independent groups.

(a) No interactive effect from cell to cell in
the  design. 

(b)  Subjects  a  problem  particularly  in  field 
research. 

B.  Specific  measures  of  performance. 

1)  Procedural  tasks. 

a)  Time 

b) Accuracy

2)  Closed-loop  tracking  (Compensatory  and  pursuit) 
a)  Accuracy. 

(1) Integrations of error ($\int e$, $\int |e|$, $\int e^2$, $\int (x - \bar{x})$, $\int (x - \bar{x})^2$)
(Variability vs. average error)

(2)  Number  of  crossings. 

(3) Time on target.

(4)  Frequency  of  catastrophic  errors. 
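
The tracking accuracy measures listed above can be sketched in Python for a sampled error record. This is an illustration under stated assumptions rather than the report's own scoring method: integrals are approximated by a rectangular sum over equally spaced samples, "crossings" are taken to mean sign changes of the error, and the tolerance band defining "on target" is a hypothetical parameter, as are all names used.

```python
def tracking_error_scores(error, dt, tolerance):
    """Score a tracking run from equally spaced error samples.

    error     -- list of signed tracking errors e(t)
    dt        -- time between samples (assumed constant)
    tolerance -- half-width of the band counted as "on target" (assumed)
    """
    mean_e = sum(error) / len(error)
    return {
        # Signed integral: positive and negative errors cancel (average error).
        "integral_e": sum(error) * dt,
        # Integrated absolute error: total size of error regardless of sign.
        "integral_abs_e": sum(abs(e) for e in error) * dt,
        # Integrated squared error: weights large excursions more heavily.
        "integral_e_sq": sum(e * e for e in error) * dt,
        # Integrated squared deviation from the mean error (variability).
        "integral_dev_sq": sum((e - mean_e) ** 2 for e in error) * dt,
        # Time on target: total time the error stays inside the band.
        "time_on_target": sum(1 for e in error if abs(e) <= tolerance) * dt,
        # Crossings: sign changes between successive samples (exact zeros
        # are not counted in this simple version).
        "crossings": sum(1 for a, b in zip(error, error[1:]) if a * b < 0),
    }


# Hypothetical error record sampled every 0.5 seconds.
scores = tracking_error_scores([1.0, -1.0, 2.0, -2.0], dt=0.5, tolerance=1.0)
for name, value in scores.items():
    print(name, value)
```

In this example the signed integral is zero even though the operator is never exactly on target, which shows why average error and variability need to be reported separately.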

3)  Crew  coordination. 

a)  Overall  gross  measure  of  crew  output  in  terms  of: 

(1) Time to accomplish task.

(2) Accuracy in accomplishing task.

b) Number of communications between members.

(1)  Voice 

(2) Other signals.

4) Decision Making Tasks.

a)  Time 

b)  Accuracy 

5)  Perceptual  and  Motor  Skills 
a)  Psychophysical  measures. 

C. Sample size and its relation to experimental method and data
treatment.

1) Small number of subjects with many measurements per
subject for psychophysical data.

2) Small sample statistics - N = 30

3)  Statistical  significance  vs  practical  significance. 

4)  Sample  size  and  single  variable  experiments. 

5) Sample size in multi-variate experiments.

a) Independent groups - N large enough to assume randomness
or  representativeness. 

b)  Subject  his  own  control. 

D. Parametric - normal probability summary statistics.

1)  Measures  of  central  tendency  -  (Mean,  Median,  Mode) 

2)  Measures  of  deviation  from  standard  (CE). 

3)  Measures  of  variability. 

a) About a standard (RMS, average deviation)

b)  About  mean  (SD,  Average  deviation) 

c) About median (Semi-interquartile and quartile range)

d)  Range 

e)  Percentiles 

f) Comparable scores (z scores, Stanines).
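
Several of the summary statistics listed above can be sketched in Python as follows. The sketch assumes that CE denotes constant error (the mean signed deviation from the standard) and that variability about a standard is expressed as root-mean-square (RMS) error; the data, the standard value and the function names are hypothetical.

```python
import math
import statistics


def constant_error(values, standard):
    """Mean signed deviation from the standard (assumed meaning of CE)."""
    return statistics.fmean([v - standard for v in values])


def rms_error(values, standard):
    """Root-mean-square deviation about the standard."""
    return math.sqrt(statistics.fmean([(v - standard) ** 2 for v in values]))


def z_scores(values):
    """Express each value in SD units about the mean (population SD)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical altitude-holding readings against a standard of 100.
readings = [98.0, 102.0, 101.0, 103.0]
standard = 100.0

ce = constant_error(readings, standard)
rms = rms_error(readings, standard)
sd = statistics.pstdev(readings)

print("CE:", ce, "RMS:", round(rms, 3), "SD about mean:", round(sd, 3))
print("z scores:", [round(z, 3) for z in z_scores(readings)])
```

With the population SD, the three measures are related by RMS^2 = CE^2 + SD^2, so RMS error about a standard combines bias and variability in a single figure.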

4)  Measures  of  correlation. 

a)  Product-moment  correlation 

b) Percent agreement (% = $r^2$)

5) Tests of reliability of differences.

a)  t  test  (means,  percentages) 

b)  chi  square 

c)  Fiducial  limits 

6) Non-Parametric Statistics

a) Difference from parametric statistics.

b)  Tests  of  reliability  of  differences. 

(1)  Wilcoxon  signed  ranks  tests  (related  samples) 

(2)  Mann-Whitney  U  Test  (independent  samples) 

c)  Test  of  correlation. 

(1) Spearman rank correlation ($r_s$)

(2) Kendall rank correlation coefficient ($\tau$)
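
Two of the non-parametric statistics named above can be sketched in plain Python. This is an illustrative sketch only: it computes the Spearman rank correlation as a product-moment correlation between ranks and the Mann-Whitney U statistic from rank sums, with tied values given the mean of their ranks. It does not compute significance levels, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rank correlation: product-moment correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def mann_whitney_u(a, b):
    """Smaller of the two U statistics for independent samples a and b."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2.0
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


# Hypothetical ratings of five crews by two observers.
observer_1 = [3, 1, 4, 2, 5]
observer_2 = [2, 1, 5, 3, 4]
print("r_s:", spearman_rs(observer_1, observer_2))
print("U:", mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

Because both statistics use only rank order, they are suited to ordinal data such as ratings, where the assumptions of the parametric tests may not hold.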

E.  Presentation  of  Results. 

1)  Graphs 

2)  Bar  Graphs  (Histograms) 

3)  Significant  differences  in  terms  of  probability. 